We establish bounds to the necessary resource consumption when building up cluster states for one-way
computing using probabilistic gates. Emphasis is put on state preparation with linear optical gates, as the 
probabilistic character is unavoidable here. We identify rigorous general bounds to the necessary consumption of 
initially available maximally entangled pairs when building up one-dimensional cluster states with individually 
acting linear optical quantum gates, entangled pairs and vacuum modes. As the known linear optics gates have 
a limited maximum success probability, as we show, this amounts to finding the optimal classical strategy of 
fusing pieces of linear cluster states. A formal notion of classical configurations and strategies is introduced for 
probabilistic non-faulty gates. We study the asymptotic performance of strategies that can be simply described, 
and prove ultimate bounds to the performance of the globally optimal strategy. The arguments employ methods 
of random walks and convex optimization. This optimal strategy is also the one that requires the shortest storage 
time, and necessitates the fewest invocations of probabilistic gates. For two-dimensional cluster states, we find, 
for any elementary success probability, an essentially deterministic preparation of a cluster state with quadratic, 
hence optimal, asymptotic scaling in the use of entangled pairs. We also identify a percolation effect in state 
preparation, in that from a threshold probability on, almost all preparations will be either successful or fail. We 
outline the implications on linear optical architectures and fault-tolerant computations. 




I. INTRODUCTION 

Optical quantum systems offer a number of advantages that 
render them suitable for attempting to employ them in archi- 
tectures for a universal quantum computer: decoherence is 
less of an issue for photons compared to other physical systems,
and many of the tools necessary for quantum state manipulation
are readily available fl H 11 II El H, 0] . Also, the
possibility of distributed computation is an essentially built-in
feature ClHlMl. Needless to say, any realization of a medium-scale
linear optical quantum computer still constitutes an
enormous challenge [10]. In addition to the usual requirement
of near-perfect hardware components - here, sources of
single photons or entangled pairs, linear optical networks, and 
photon detectors - one has to live with a further difficulty in- 
herent in this kind of architecture: due to the small success 
probability of elementary gates, a very significant overhead in 
optical elements and additional photons is required to render 
the overall protocol near-deterministic. 

Indeed, as there are no photon-photon interactions present 
in coherent linear optics, all non-linearities have to be induced 
by means of measurements. Hence, the probabilistic character 
is at the core of such schemes. It was the very point of the cele- 
brated work of Ref. [Q]] that near-deterministic quantum com- 
putation is indeed possible using quantum gates (here: non- 
linear sign shift gates) that operate with a very low probability 
of success: only one quarter. Ironically, it turned out later that 
this value cannot be improved at all within the setting of linear
optics without feed-forward [11]. Essentially due to this
small probability, an enormous overhead in resources in the 
full scheme involving feed-forward is needed. 

There is, fortunately, nevertheless room for a reduction of 
this overhead, based on this seminal work. Recent years saw 
a development reminiscent of a "Moore's law", in that each 
year, a new scheme was suggested that reduced the necessary
resources by a large factor. In particular, the most promising
results have been achieved [3, 4, 5] by abandoning the standard
gate model of quantum computation [10] in favor of the
measurement-based one-way computer [12]. Taking resource
consumption as a benchmark, the most recent schemes range 
more than two orders of magnitude below the original pro- 
posal. It is thus meaningful to ask: How long can this devel- 
opment be sustained? What are the ultimate limits to overhead 
reduction for linear optics quantum computation? The latter 
question was one of the main motivations for our work. 

The reader is urged to recall that a computation in the one-way
model proceeds in two steps. Firstly, a highly entangled
cluster state HH [H, H [3 is built up. Secondly, local
measurements are performed on this state, the outcomes of
which encode the result of the computation. As the ability to
perform local measurements is part of the linear optical toolbox,
the challenge lies solely in realizing the first step. More
specifically, one- and two-dimensional cluster states can be
built from EPR pairs [16] using probabilistic so-called fusion
gates. In the light of this framework, the question posed at the
end of the last paragraph takes on the form: measured in the
number of required entangled pairs, how efficiently can one
prepare cluster states using probabilistic fusion gates? There
have been several proposals along these lines in recent years.

It will be shown that the success probability of these gates 
cannot be pushed beyond the currently known value of one
half. Therefore, the only degree of freedom left in optimizing 
the process lies in adopting an optimal classical control strat- 
egy, which decides how the fusion gates are to be employed. 
This endeavor is greatly impeded by the gates' probabilistic 
nature: the number of possible patterns of failure and success 
scales exponentially (see Fig. 3), and hence deciding how to
optimally react to any of these situations constitutes a very 
hard problem indeed. 

Maybe surprisingly, we find that classical control has
tremendous implications concerning resource consumption
(which seems particularly relevant when building up structures
that render a scheme eventually fault-tolerant [21, 22]):
even when aiming for moderately sized cluster states, one can
easily reduce the required amount of entangled pairs by an
order of magnitude when adopting the appropriate strategy.
For the case of one-dimensional clusters, we identify a limit
to the improvement of resource consumption by very tightly
bounding from above the performance of any scheme which
makes use of EPR pairs, vacuum modes and two-qubit quantum
gates. In the two-dimensional setup, we establish that
cluster states of size $n \times n$ can be prepared using $O(n^2)$ input
pairs.

We aim at providing a comprehensive study of the potential 
and limits to resource consumption for one-way computing, 
when the elementary gates operate in a non-faulty, but probabilistic
fashion. As the work is phrased in terms of classical
control strategies, it applies equally to the linear optical setting
as to other architectures [17, 19, 23], such as those involving
matter qubits and light as an entangling bus [17, 24]. This
work extends an earlier report (Ref. [20]), where most ideas
have already been sketched.

II. SUMMARY OF RESULTS 

Although the topic and results have very practical implica- 
tions on the feasibility of linear optical one-way computing, 
we will have to establish a rather formal and mathematical set- 
ting in order to obtain rigorous results. To make these more 
accessible, we provide a short summary: 

• We introduce a formal framework of classical strategies 
for building up linear cluster states. Linear cluster states 
can be pictured as chains of qubits, characterized by 
their length $l$ given in the number of edges. Maximally
entangled qubit pairs correspond to chains with a single
edge. By a configuration we mean a set of chains of specific
individual lengths. Type-I fusion allows for operations
involving end qubits of two pieces (lengths $l_1$
and $l_2$), resulting on success in a single piece of length
$l_1 + l_2$ or on failure in two pieces of length $l_1 - 1$ and
$l_2 - 1$. The process starts with a collection of $N$ EPR
pairs and ends when only a single piece is left. A strategy
decides which chains to fuse given a configuration.
It is assessed by the expected length, or quality $Q(N)$
of the final cluster. The vast majority of strategies allow 
for no simple description and can be specified solely 
by a "lookup table" listing all configurations with the 
respective proposed action. Since the number of con- 
figurations scales exponentially as a function of the to- 
tal number of edges N, a single strategy is already an 
extremely complex object and any form of brute force 
optimization is completely out of reach. 

• After discussing the optimality of the primitive elementary
physical gates, operating with a success probability
of $p_s = 1/2$, we start by studying the performance of
several simple strategies. In particular, we study strategies
which we refer to as MODESTY and GREED:

GREED: Always fuse the largest available
linear cluster chains in a configuration.
MODESTY: Always fuse the smallest available
linear cluster chains in a configuration.

Also, we investigate the strategy
STATIC with a linear yield that minimizes the amount
of sorting and feed-forward.

• We find that the choice of the classical strategy has a 
major impact on the resource consumption in the prepa- 
ration of linear cluster states. When preparing clus- 
ter chains with an expected length of 40, the number 
of required EPR pairs already differs by an order of 
magnitude when resorting to MODESTY as compared
to GREED.

• We provide an algorithm that symbolically identifies the 
globally optimal strategy, which yields the longest av- 
erage chain with a given number N of initially available 
EPR pairs. This globally optimal strategy can be found 
with an effort of $O(|\mathcal{C}^{(N)}| (\log |\mathcal{C}^{(N)}|)^5)$. Here, $|\mathcal{C}^{(N)}|$
is the number of all configurations with a total number
of up to $N$ edges.

• We find that MODESTY is almost globally optimal. For 
$N < 46$, the relative difference in the quality of the
globally optimal strategy and MODESTY is less than
$1.1 \times 10^{-3}$.

• Requiring significantly more formal effort, we provide 
fully rigorous proofs of tight analytical upper bounds 
concerning the quality of the globally optimal strategy. 
In particular, we find 

Q(N) < N/5 + 2. 

That is, frankly, within the setting of linear optics, in the 
sense made precise below, one has to invest at least five 
EPR pairs per average gain of one edge in the cluster 
state. 

• A key step in the proof is the passage to a radically 
simplified model - dubbed razor model. Here, clus- 
ter pieces are cut down to a maximal length of two. 
While this step reduces the size of the configuration 
space tremendously, it retains - surprisingly - essential 
features of the problem. The whole problem can then be 
related to a random walk in a plane [25], and finally, to
a convex optimization problem [26]. This bound constitutes
the central technical result.

• The razor model also provides tools to get good numer- 
ical upper bounds with polynomial effort in N. 

• Similarly, we find tight lower bounds for the quality, 
based on the symbolically available data for small val- 
ues of N. 

• We show that the questions (i) "given some fixed number
of input pairs, how long a single chain can be obtained
on average?" and (ii) "how many input pairs are
needed to produce a chain of some fixed length with (al- 
most) unity probability of success?" are asymptotically 
equivalent. 

• For two-dimensional structures, we prove that one can
build up cluster states with the optimal, quadratic use
of resources, even when resorting to probabilistic gates:
for any success probability $p_s \in (0, 1]$ of the physical
primitive quantum gate, one can prepare an $n \times n$ cluster
state consuming $O(n^2)$ EPR pairs. Previously known
schemes have operated with a more costly scaling. This
is possible in a way that the overall success probability
$P_s(n) \to 1$ as $n \to \infty$. That is, even for quantum gates
operating with a very small probability of success $p_s$,
one can asymptotically deterministically build up two-dimensional
cluster states using quadratically scaling
resources.

• For this preparation, we observe an intriguing percolation
effect when preparing cluster states using probabilistic
gates: from a certain threshold probability

$p_s > p_{th}$

on, almost all preparations of an $n \times n$ cluster will succeed,
for large $n$. In turn, for $p_s < p_{th}$, almost no preparation
will succeed asymptotically.

• Also, cluster structures can be used for loss tolerant or 
fully fault tolerant quantum computing using linear optics.
The required resources for the latter are tremendous,
so the ideas presented here should give rise to a
very significant reduction in the number of EPR pairs 
required. 

In deriving the bounds, we assumed dealing with a linear op- 
tical scheme 

• based on the computational model of one-way computing
on cluster states in dual-rail encoding,

• using EPR pairs from sources as resource to build up 
cluster states, and allowing for any number of additional 
vacuum modes that could assist the quantum gates, 

• such that one sequentially builds up the cluster state 
from elementary fusion quantum gates. 

Sequential means that we do not consider the possible multi-port
devices - as, e.g., in Ref. [23] - involving a large number
of systems at a time (where the meaning of the asymptotic 
scaling of resources is not necessarily well-defined). In this 
sense, we identify the final limit of performance of such a 
linear optical architecture for quantum computing. 

Structurally, we first discuss the physical setting. After 
introducing a few concepts necessary for what follows, we 
discuss on a more phenomenological level the impact of the 
classical strategy on the resource consumption [21, 22]. The
longest part of the paper is then concerned with the rigorous
formal arguments. Finally, we summarize what has been
achieved, and present possible scopes for further work in this
direction.

FIG. 1: Action of a fusion gate on the end qubits of two linear cluster
states. [Figure label: Success ($p_s = 1/2$).]



III. PREPARING LINEAR CLUSTER STATES WITH
PROBABILISTIC QUANTUM GATES

A. Cluster states and fusion gates 



A linear cluster state [12] is an instance of a graph state
fl3l 0] of a simple graph corresponding to a line segment.
Any such graph state is associated with an undirected graph,
i.e., with $n$ vertices and a set $E$ of edges, that is, of pairs $(a, b)$ of
connected vertices. Graph states can be defined as those states
whose state vector is of the form

$$|G\rangle = \prod_{(a,b) \in E} U^{(a,b)} \left( (|0\rangle + |1\rangle)/2^{1/2} \right)^{\otimes n}, \qquad (1)$$

where $U^{(a,b)} := |0\rangle\langle 0|^{(a)} \otimes \mathbb{1}^{(b)} + |1\rangle\langle 1|^{(a)} \otimes \sigma_z^{(b)}$, $\sigma_z$ denoting
the familiar Pauli operator. In this basis, a linear cluster
state vector of some length $l$ is hence just a sum of all binary
words on $n$ qubits with appropriate phases. An EPR pair is
consequently conceived as a linear cluster state with a single
edge, $l = 1$ MM. A two-dimensional cluster state is the graph
state corresponding to a two-dimensional cubic lattice. Only
the describing graphs will be relevant in the sections to come;
the quantum nature of graph states does not enter our considerations.
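
To make the remark about binary words concrete, the following Python sketch enumerates the amplitudes of a linear cluster state directly from the graph-state definition above. It is only an illustration: the function name `linear_cluster_amplitudes` is our own choice and not part of the original text, and it assumes that each edge acts as a controlled phase gate, flipping the sign of a binary word exactly when both of its qubits are in $|1\rangle$.

```python
from itertools import product


def linear_cluster_amplitudes(length):
    """Amplitudes of a linear cluster state with `length` edges.

    Returns a dict mapping each binary word (tuple of 0/1) on
    n = length + 1 qubits to its amplitude.
    """
    n = length + 1
    edges = [(a, a + 1) for a in range(length)]  # line-segment graph
    norm = 2 ** (-n / 2)
    amplitudes = {}
    for word in product((0, 1), repeat=n):
        # each controlled-phase gate flips the sign iff both qubits are 1
        sign = (-1) ** sum(word[a] * word[b] for a, b in edges)
        amplitudes[word] = sign * norm
    return amplitudes


# A single edge (length 1) is the two-qubit case discussed in the text.
epr = linear_cluster_amplitudes(1)
for word, amplitude in sorted(epr.items()):
    print(word, amplitude)
```

For a single edge only the word (1, 1) carries a negative sign, so the state is a maximally entangled pair up to local unitaries.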

As stated before, we call a quantum mechanical gate a
(type-I) fusion gate JH if it can "fuse together" two linear cluster
state "chains" with $l_1$ and $l_2$ edges respectively to yield a
single chain of $l_1 + l_2$ edges (see Fig. 1). The process is supposed
to succeed with some probability $p_s$. In case of failure
both chains lose one edge each: $l_i \mapsto l_i - 1$. Unless stated
otherwise, we will assume that $p_s = 1/2$, in accordance with
the results of the next section.

This kind of quantum gate is, however, insufficient to build up
two-dimensional cluster states. For this to be possible, another
kind of fusion gate is required: type-II fusion ^ to be
discussed in Section X.



B. Linear optical fusion gates 

We use the usual convention for encoding a qubit into photons:
In the so-called dual-rail encoding the basis vectors of
the computational Hilbert space are represented by

$$|0\rangle := a_0^\dagger |\mathrm{vac}\rangle, \qquad |1\rangle := a_1^\dagger |\mathrm{vac}\rangle,$$

where $a_0^\dagger, a_1^\dagger$ denote the creation operators in two orthogonal
modes, and $|\mathrm{vac}\rangle$ is the state vector of the vacuum. The canonical
choice are two modes that only differ in the polarization
degree of freedom, e.g. horizontal and vertical with respect
to some reference, giving rise to the notation $|H\rangle := |0\rangle$ and
$|V\rangle := |1\rangle$.

FIG. 2: Diagram representing how a parity check gate can be employed
to realize a Bell state discriminating device.

Type-I fusion gates were introduced in Ref. [5], where it
was realized that the parity check gate J3l has exactly the desired
effect. The gate's probability of success is $p_s = 1/2$ and
the following theorem states that this cannot be increased in
the setting of dual-rail coded linear optical quantum computation.

Theorem 1 (Maximum probability of success of fusion). The
optimal probability of success $p_s$ of a type-I fusion quantum
gate is $p_s = 1/2$. More specifically, the maximal $p = p_1 + p_2$
such that

$$A_1 = p_1^{1/2} \left( |H\rangle\langle H,H| - |V\rangle\langle V,V| \right)/\sqrt{2},$$
$$A_2 = p_2^{1/2} \left( |H\rangle\langle H,H| + |V\rangle\langle V,V| \right)/\sqrt{2}$$

are two Kraus operators of a channel that can be realized
making use of (i) any number of auxiliary modes prepared in
the vacuum, (ii) linear optical networks acting on all modes,
and (iii) photon counting detectors is given by $p = p_s := 1/2$.

Proof. Given the setup in Fig. 2 we notice that a parity check described
by these Kraus operators can be used to realize a measurement,
distinguishing with certainty two from four binary
Bell states: The following Hadamard gate and measurement
in the computational basis give rise to the Kraus operators

$$B_\pm = 2^{-1/2} \left( \langle H| \pm \langle V| \right).$$

On input of the symmetric Bell states with state vectors
$|\phi^\pm\rangle = 2^{-1/2}(|H,H\rangle \pm |V,V\rangle)$, the measurement results
$(A_1, B_-)$ and $(A_2, B_+)$ indicate a $|\phi^+\rangle$ and $(A_1, B_+)$ and
$(A_2, B_-)$ a $|\phi^-\rangle$, respectively. These two states can be identified
with certainty. The anti-symmetric Bell states with state
vectors $|\psi^\pm\rangle = 2^{-1/2}(|H,V\rangle \pm |V,H\rangle)$ will in turn result in
a failure outcome.

Applying a bit-flip (a Pauli $\sigma_x$) on the second input qubit
(therefore implementing the map $|\phi^\pm\rangle \mapsto |\psi^\pm\rangle$,
$|\psi^\pm\rangle \mapsto |\phi^\pm\rangle$) at random, a discrimination between the four Bell states
with uniform a priori probabilities is possible, succeeding in
50% of all cases. Following Ref. [28], this is already the optimal
success probability when only allowing for (i) auxiliary
vacuum modes, (ii) networks of beam splitters and phase shifters,
and (iii) photon number resolving detectors. Thus, a more reliable
parity check is not possible within the presented framework. □

FIG. 3: An example of a tree of successive configurations under
application of a strategy. Light boxes group configurations. We start
with $N = 4$. Dark boxes indicate where the strategy decided to apply
a fusion gate. Possible outcomes are success (to the left) or failure (to
the right), resulting in different possible future choices. The expected
length of the final chain is $Q_M(4) = Q(4) = 13/8$.
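
The assignment of measurement outcomes to Bell states used in the proof of Theorem 1 can be checked mechanically. The sketch below is our own illustration, not part of the original argument: it drops all normalization factors (including $p_1^{1/2}$, $p_2^{1/2}$), since only the pattern of vanishing amplitudes matters here, and the names `amplitude` and `possible_outcomes` are hypothetical helpers.

```python
# Basis order for two dual-rail qubits: |H,H>, |H,V>, |V,H>, |V,V>
A = {1: [[1, 0, 0, 0], [0, 0, 0, -1]],   # |H><H,H| - |V><V,V|  (unnormalized)
     2: [[1, 0, 0, 0], [0, 0, 0, 1]]}    # |H><H,H| + |V><V,V|  (unnormalized)
B = {"+": [1, 1], "-": [1, -1]}          # <H| + <V|  and  <H| - <V|

BELL = {"phi+": [1, 0, 0, 1], "phi-": [1, 0, 0, -1],
        "psi+": [0, 1, 1, 0], "psi-": [0, 1, -1, 0]}


def amplitude(i, sign, state):
    """Unnormalized amplitude of outcome (A_i, B_sign) on a two-qubit state."""
    after_parity = [sum(a * s for a, s in zip(row, state)) for row in A[i]]
    return sum(b * x for b, x in zip(B[sign], after_parity))


def possible_outcomes(name):
    """All outcome pairs (i, sign) that occur with nonzero amplitude."""
    return {(i, sign) for i in A for sign in B
            if amplitude(i, sign, BELL[name]) != 0}


for name in BELL:
    print(name, sorted(possible_outcomes(name)))
```

The two $\phi$ states produce disjoint sets of outcomes, while both $\psi$ states produce none, which corresponds to the failure outcome of the parity check.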

In turn, it is straightforward to see that a failure necessar- 
ily leads to a loss of one edge each. Note that one could in 
principle use additional single-photons from sources or EPR 
pairs to attempt to increase the success probability p s of the 
individual gate. These additional resources would, however, have to
be included in the resource count. Such a generalized scenario 
will not be considered. 



IV. CONCEPTS: CONFIGURATIONS AND STRATEGIES 

The current section will set up a rigorous framework for the 
description and assessment of control strategies. All consid- 
erations concern the case of one-dimensional cluster states; 
the two-dimensional case will be deferred to Section X. Note
that, having described the action of the elementary gate on the 
level of graphs, we may abstract from the quantum nature of 
the involved cluster states altogether. 



A. Configurations 

A configuration (in the identity picture) $I$ is a list of numbers
$I_k$, $k \in \mathbb{N}$. We think of $I_k$ as specifying the length of
the $k$-th chain that is available to the experimenter at some
instance of time. For most of the statements to come a more
coarse-grained point of view is sufficient: in general we do
not have to distinguish different chains of equal length. It is
hence expedient to introduce the anonymous representation
of a configuration $C$ as a list of numbers $C_i$, $i \in \mathbb{N}$, with $C_i$
specifying the number of chains of length $i$. We will always
use the latter description unless stated otherwise. Trailing zeroes
will be suppressed, i.e. we abbreviate $C = 1, 2, 0, \dots$
as $C = (1, 2)$. Define the total number of edges (total length)
to be $L(C) = \sum_i i\, C_i$. The space of all configurations is denoted
by $\mathcal{C}$. By $\mathcal{C}^{(N)}$ we mean the set of configurations $C$ having
a total length less than or equal to $N$. Lastly, let $e_i$ be the configuration
consisting of exactly one chain of length $i$. This definition
allows us to expand configurations as $C = \sum_{i=1}^{\infty} C_i e_i$.



B. Elementary rule 

Let us re-formulate the action of the fusion gate in this language.
An attempted fusion of two chains of length $k$ and $l$
gives rise to a map $C = \sum_{i=1}^{\infty} C_i e_i \mapsto C' = \sum_{i=1}^{\infty} C'_i e_i$ with

$$C' = C - e_k - e_l + e_{k+l}$$

in case of success with probability $p_s = 1/2$ (leading to a
single chain of length $l + k$) and

$$C' = C - e_k + e_{k-1} - e_l + e_{l-1}$$

in case of failure, meaning that one edge each is lost for the
chains of length $k$ and $l$. All other elements of $C$ are left
unchanged.
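
The elementary rule translates almost literally into code. The following Python sketch is an illustration under our own conventions, not the authors' implementation: a configuration in the anonymous picture is stored as a dictionary mapping a chain length $i$ to the count $C_i$, the function name `fuse` is hypothetical, and chains of length zero are simply dropped.

```python
from collections import Counter

P_SUCCESS = 0.5  # success probability p_s of the type-I fusion gate


def fuse(config, k, l):
    """Return [(probability, new_config), ...] for fusing lengths k and l.

    `config` maps chain length -> number of chains (anonymous picture).
    """
    needed = Counter([k, l])  # for k == l two chains of that length are needed
    if any(config.get(length, 0) < count for length, count in needed.items()):
        raise ValueError("configuration does not contain the requested chains")

    def updated(removed, added):
        new = Counter(config)
        new.subtract(removed)
        new.update(a for a in added if a > 0)  # zero-length chains vanish
        return {length: c for length, c in new.items() if c > 0}

    success = updated([k, l], [k + l])          # C - e_k - e_l + e_{k+l}
    failure = updated([k, l], [k - 1, l - 1])   # one edge lost on each chain
    return [(P_SUCCESS, success), (1 - P_SUCCESS, failure)]


# Two EPR pairs and one chain of length two; fuse an EPR pair with the chain.
outcomes = fuse({1: 2, 2: 1}, 1, 2)
for probability, configuration in outcomes:
    print(probability, configuration)
```

On success the example yields one EPR pair and one chain of length three; on failure the EPR pair is destroyed and the longer chain shrinks to length one, leaving two EPR pairs.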



C. Strategies 

A strategy (in the anonymous picture) defines what action
to take when faced with a specific configuration. Actions can
be either "try to fuse a chain of length $k$ with one of length $l$"
or "do nothing". Formally, we will represent these choices by
the tuple $(k, l)$ and the symbol $\emptyset$, respectively. It is easy to see
that, in trying to build up a single long chain, it never pays off
not to use all available resources. We hence require a strategy
to choose a non-trivial action as long as there is more than one
chain in the configuration. Formally, a strategy is said to be
valid if it fulfills

1. (No null fusions): $S(C) = (k, l) \Rightarrow C_k, C_l \neq 0$.

2. (No premature stops): $S(C) = \emptyset \Rightarrow C$ contains at most
one chain.

We will implicitly assume that all strategies that appear are
valid. Strategies in the identity picture are defined completely
analogously.

An event $E$ is a string of elements of $\{S, F\}$, denoting success
and failure, respectively. The $i$-th component of $E$ is
denoted by $E_i$ and its length by $|E|$. Now fix an initial configuration
$C_\emptyset$ and some strategy $S$. We write $C_E$ for the configuration
which will be created by $S$ out of $C_\emptyset$ in the event
$E$. Here, as in several definitions to come, the strategy $S$ is
not explicitly mentioned in the notation. It is easy to see that
any strategy acting on some initial configuration will, in any
event, terminate after a finite number of steps $n_T(C_\emptyset)$.



Recall that the outcome of each action is probabilistic and a
priori we do not know which $C_E$ with $|E| = n$ will have been
obtained in the $n$-th step. It is therefore natural to introduce a
probability distribution on $\mathcal{C}$, by setting

$$p_n(C) := 2^{-n} \left| \{E : |E| = n,\ C_E = C\} \right|.$$

In words: $p_n(C)$ equals $2^{-n}$ times the number of events that
lead to $C$ being created. The fact that $S$ terminates after a finite
number of steps translates to $p_{n_T + k} = p_{n_T}$ for all positive
integers $k$. Expectation values of functions $f$ on $\mathcal{C}$ now can be
written as

$$\langle f \rangle_{p_n} := \sum_{C} p_n(C) f(C).$$

The expected total length is

$$\langle L \rangle_{p_n} := \sum_{C, i} p_n(C)\, i\, C_i.$$

In particular, the expected final length is given by $Q_S(C_\emptyset) :=
\langle L \rangle_{p_{n_T}}$. Of central importance will be the best possible expected
final length that can be achieved by means of any strategy:

$$Q(C_\emptyset) := \sup_S Q_S(C_\emptyset).$$

This number will be called the quality of $C_\emptyset$. For convenience
we will use the abbreviations $Q_S(N) := Q_S(N e_1)$ and
$Q(N) := Q(N e_1)$.
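
For small inputs, the expected final length of a strategy and the quality can be evaluated exactly by recursing over success and failure outcomes. The sketch below is a minimal illustration with names of our own choosing (`outcomes`, `expected_length`, `quality`); configurations are stored as sorted tuples of chain lengths, and the two sample strategies are the GREED and MODESTY rules described in Section II. The exhaustive maximization in `quality` is a brute-force stand-in feasible only for a handful of pairs, not the symbolic algorithm referred to in the summary.

```python
from fractions import Fraction
from functools import lru_cache

HALF = Fraction(1, 2)  # p_s = 1/2


def outcomes(chains, i, j):
    """Success and failure configurations when fusing chains[i] and chains[j]."""
    rest = [c for n, c in enumerate(chains) if n not in (i, j)]
    k, l = chains[i], chains[j]
    success = tuple(sorted(rest + [k + l]))
    failure = tuple(sorted(rest + [c for c in (k - 1, l - 1) if c > 0]))
    return success, failure


def greed(chains):      # indices of the two longest chains (tuple is sorted)
    return len(chains) - 1, len(chains) - 2


def modesty(chains):    # indices of the two shortest chains
    return 0, 1


def expected_length(chains, strategy):
    """Expected final length for a strategy given as a function on tuples."""
    @lru_cache(maxsize=None)
    def q(state):
        if len(state) <= 1:             # at most one chain left: stop
            return Fraction(sum(state))
        s, f = outcomes(state, *strategy(state))
        return HALF * q(s) + HALF * q(f)
    return q(tuple(sorted(chains)))


@lru_cache(maxsize=None)
def quality(state):
    """Best expected final length over all choices (small sorted tuples only)."""
    if len(state) <= 1:
        return Fraction(sum(state))
    best = Fraction(0)
    for i in range(len(state)):
        for j in range(i + 1, len(state)):
            s, f = outcomes(state, i, j)
            best = max(best, HALF * quality(s) + HALF * quality(f))
    return best


epr4 = (1, 1, 1, 1)
print(expected_length(epr4, greed), expected_length(epr4, modesty), quality(epr4))
# prints: 3/2 13/8 13/8
```

For four EPR pairs the value 13/8 agrees with the expected length quoted in the caption of Fig. 3.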



V. SIMPLE STRATEGIES 

A priori, a strategy does not allow for a more economic description
other than a 'look-up table', specifying what action
to take when faced with a given configuration. If one restricts
attention to the set of configurations $\mathcal{C}^{(N)}$ that can be reached
starting from $N$ EPR pairs, $|\mathcal{C}^{(N)}|$ values have to be fixed.

The cardinality $|\mathcal{C}^{(N)}|$, in turn, can be derived from the accumulated
number of integer partitions of $k \le N$. The asymptotic
behavior [29] can be identified to be

$$|\mathcal{C}^{(N)}| = \frac{1 + O(N^{-1/6})}{(8 \pi^2 N)^{1/2}}\, e^{\pi (2N/3)^{1/2}},$$

which is exponential in the number $N$ of initially available
EPR pairs [30].

However, there are of course strategies which do allow for
a simpler description in terms of basic general rules that apply
similarly to all possible configurations. It might be surmised
that close-to-optimal strategies can be found among
them. Also, these simple strategies are potentially accessible
to analytical and numerical treatment. Subsequently, we
will discuss three such reasonable strategies, referred to as
GREED, MODESTY, and STATIC.
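
The growth of the look-up table can be made tangible by counting configurations directly. The following sketch is our own illustration: it counts integer partitions with a standard dynamic program and compares the accumulated count with the leading term $e^{\pi (2N/3)^{1/2}} / (8 \pi^2 N)^{1/2}$. The function names are hypothetical, and including the empty configuration ($k = 0$) in the count is an assumption on our part.

```python
import math


def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)], the numbers of integer partitions."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):          # allow parts of this size
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p


def count_configurations(n):
    """Number of configurations with total length at most n (k = 0 included)."""
    return sum(partition_numbers(n))


def leading_asymptotics(n):
    """Leading term exp(pi * sqrt(2n/3)) / sqrt(8 * pi^2 * n)."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / math.sqrt(8 * math.pi ** 2 * n)


for n in (10, 40, 160):
    print(n, count_configurations(n), round(leading_asymptotics(n)))
```

Each counted configuration would need its own entry in a look-up table, which is why an explicit tabulation of a general strategy quickly becomes impractical.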







FIG. 4: The process of fusion of the largest can be represented as a 
tree similar to a random walk. Reflection occurs at the dashed line 
(the largest string is lost and replaced with an EPR pair). Time evolves
from top to bottom, thus decreasing the number of EPR resources. 
The horizontal dimension represents the length of the largest string. 



A. Greed 

This is one of the most intuitive strategies. It can be de- 
scribed as follows: "Given any configuration, try to fuse the 
largest two available chains". This is nothing but 



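$$S_G(C) = \begin{cases} \emptyset & \text{if } \sum_i C_i \le 1, \\ (k, l) & \text{otherwise, with } k = \max\{i : C_i > 0\},\ l = \max\{i : C_i - \delta_{i,k} > 0\}. \end{cases}$$
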

Alternatively, one may think of GREED as fusing the first two 
chains after sorting the configuration in descending order. The 
rationale behind choosing this strategy is the following: fus- 
ing is a probabilistic process which destroys entanglement on 
average. Hence it should be advantageous to quickly build 
up as long a chain as possible. Clearly, the strategy's name 
stems from its pursuit of short-term success. From a theo- 
retical point of view, GREED is interesting, as its asymptotic 
performance can easily be assessed (see Fig. 5):

Lemma 2 (Asymptotic performance of GREED). The expected
length of the final chain after applying GREED to $N$
EPR pairs scales asymptotically as

$$Q_G(N) = (2N/\pi)^{1/2} + o(1).$$



Proof. It is easy to see that an application of GREED to
$C_\emptyset = N e_1$ only generates configurations in $\{m e_1 + e_l : m =
0, \dots, N;\ l = 0, \dots, N;\ l + m \le N\}$. This set is parametrized
by $m$ (the number of EPR resources) and $l$, giving rise to the
notation $C = (l, m)$. By definition of $S_G$, whenever $l > 1$,
the next fusion attempt is made on this longer chain and one
of the other EPR pairs. As for the case $l = 0$ we identify
$(0, m)$ with $(1, m - 1)$ (when encountering $(0, m)$ we distinguish
one of the $m$ pairs). Therefore, in this slightly modified
notation we have with $C_E = (l, m)$, $l > 0$, in case
of success $C_{ES} = (l + 1, m - 1)$ and in case of failure
$C_{EF} = (l - 1, m - 1)$, respectively.

FIG. 5: Expected length for the globally optimal strategy, for MODESTY
(in this plot indistinguishable from the former), for GREED, its
asymptotic performance, and the lower bound for STATIC, as functions
of even number $N$ of initial EPR pairs. The inset shows GREED
and MODESTY for small $N$, revealing the parity-induced step-like
behavior.

The tree in Fig. 4 can be obtained by reflecting the negative
half of a standard random walk tree at $l = 0$ and identifying
the vertices with same $m$ but opposite $l$. One can readily read
off the expectation value of the final chain's length. The form is
especially simple in the balanced case ($p_s = 1/2$),

$$Q_G(N) = 2 \sum_{k=0}^{\lfloor (N-1)/2 \rfloor} 2^{-N} (N - 2k) \binom{N}{k}.$$

The probabilities are twice the probabilities of the standard
random walk tree, and the length-0 term has been omitted.

Using an estimate based on a Gaussian distribution we easily
find the asymptotic behavior for large $N$ (setting $\mu = p_s N$ and
$\sigma^2 = p_s (1 - p_s) N$ with $p_s = 1/2$),

$$Q_G(N) = \left( \frac{2N}{\pi} \right)^{1/2} \int_0^\infty 2x \exp(-x^2)\, dx + r(N) = \left( \frac{2N}{\pi} \right)^{1/2} \Gamma(1) + r(N),$$

with approximation error $r(N) = o(1)$. □
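
The statements of the proof can be cross-checked numerically. The sketch below is our own illustration with hypothetical function names. It evaluates the sum $2 \sum_k 2^{-N} (N - 2k) \binom{N}{k}$ over $k \le \lfloor (N-1)/2 \rfloor$, which is the form one obtains for the reflected walk with $p_s = 1/2$. It then follows the walk on the pairs $(l, m)$ step by step, using the identification of $(0, m)$ with $(1, m - 1)$, and compares both with the asymptotic value $(2N/\pi)^{1/2}$.

```python
from fractions import Fraction
from math import comb, pi, sqrt


def q_greed_closed_form(n):
    """Evaluate 2 * sum_k 2^{-n} (n - 2k) binom(n, k) for k <= (n - 1) // 2."""
    total = sum((n - 2 * k) * comb(n, k) for k in range((n - 1) // 2 + 1))
    return Fraction(2 * total, 2 ** n)


def q_greed_walk(n):
    """Expected final length obtained by following the walk on (l, m)."""
    dist = {(0, n): Fraction(1)}
    while any(m > 0 for (_, m) in dist):
        new = {}
        for (l, m), p in dist.items():
            if m == 0:
                targets = [((l, 0), p)]                # no EPR pairs left
            elif l == 0:
                targets = [((1, m - 1), p)]            # single out one EPR pair
            else:
                targets = [((l + 1, m - 1), p / 2),    # success
                           ((l - 1, m - 1), p / 2)]    # failure
            for state, weight in targets:
                new[state] = new.get(state, 0) + weight
        dist = new
    return sum(l * p for (l, _), p in dist.items())


for n in (4, 16, 64, 256):
    print(n, float(q_greed_closed_form(n)), sqrt(2 * n / pi))
```

Both functions work with exact fractions, so their agreement can be tested without rounding issues.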



The behavior of GREED changes qualitatively upon variation
of $p_s$: For $p_s > 1/2$, $Q_G(N)$ shows linear asymptotics in
$N$, while in case of $p_s < 1/2$ the quality $Q_G(N)$ even remains
bounded as a function of $N$.

There is a phenomenon present in the performance of many
strategies, which can be understood particularly easily when
considering GREED: $Q_G$ displays a "smooth" behavior when
regarded as a function on either only even or only odd values
of $N$. However, the respective graphs appear to be slightly
displaced with respect to each other. For simplicity, we will
in general restrict our attention to even values and explore the
reasons for this behavior in the following lemma.

Lemma 3 (Parity and $Q_G$). Let $N$ be odd. Then $Q_G(N) =
Q_G(N + 1)$.

Proof. Let $C_\emptyset = N e_1$, $C'_\emptyset = (N + 1) e_1$, for $N$ odd. Now
let $E$ be such that $S(C_E) = \emptyset$ but $S(C_{E_1 \dots E_{|E|-1}}) \neq \emptyset$. As
GREED does not touch the $i$-th chain before the $i$-th step, it
holds that $C'_E = C_E + e_1$. Further, since type-I fusion preserves
the parity of the total number of edges, $C_E \neq 0$. Hence
$C'_E$ is of the form $C'_E = e_1 + e_k$ and one computes:

$$Q_G(C'_E) = 1/2\,(k + 1) + 1/2\,(k - 1) = k = Q_G(C_E).$$

From here, the assertion is easily established by re-writing
$Q_G(C'_\emptyset)$ as a suitable average over terms of the form
$Q_G(C'_E)$, where $E$ fulfills the assumptions made above. □

As a corollary to the above proof, note that fusing an EPR
pair to another chain does not, on average, increase its length.
Hence the fact that $Q_G(N)$ grows at all as a function of $N$ is
solely due to the asymmetric situation at length zero.

Lemma 3 explains the steps apparent in Fig. 5. Such steps
are present also in the performance of MODESTY, to be discussed
now, and several other strategies - albeit not in such a
distinct manner.

B. Modesty

There is a very natural alternative to the previously studied
strategy. Instead of trying to fuse always the largest existing
linear cluster states in a configuration, one could try the opposite:
"Given any configuration, try to fuse the smallest two
available chains". In contrast to GREED this strategy intends
to build up chains of intermediate length, making use of the
whole EPR reservoir before trying to generate larger chains.
Even though no long chains will be available at early stages,
the strategy might nevertheless perform reasonably. Quite naturally,
we will call this strategy MODESTY.

Formally, this amounts to replacing max by min, i.e. replacing
descending order by ascending order:

$$S_M(C) = \begin{cases} \emptyset & \text{if } \sum_i C_i \le 1, \\ (k, l) & \text{otherwise, with } k = \min\{i : C_i > 0\},\ l = \min\{i : C_i - \delta_{i,k} > 0\}. \end{cases}$$

Maybe surprisingly, MODESTY will not only turn out to give
better results than GREED, but is actually close to being globally
optimal, as can be seen in Figures 5 and 6. See Section
VI B for a closer discussion.

FIG. 6: Expected length for MODESTY, the optimal strategy (where
known), a lower bound to the quality as in Theorem 6 but with $N_0 =
46$ (for better visualization), and the upper bound attained with the
razor model as functions of the number of initial EPR pairs $N$.

C. Static


Another strategy of particular interest is called STATIC, $S_S$. To describe its action, we need to define the notion of an insistent strategy. The term is only meaningful in the identity picture, which we will employ for the course of this section. Now, a strategy is called insistent if, whenever it decides to fuse two specific chains, it will keep on trying to glue these two together until either successful or at least one of the chains is completely destroyed. Formally:

\[ S(C_E) = (k,l) \,\wedge\, (C_{E,F})_k \, (C_{E,F})_l \neq 0 \;\Rightarrow\; S(C_{E,F}) = S(C_E). \]

STATIC acts by insistently fusing the first chain to the second one, the third to the fourth, and so on. After this first level, the resulting chains will be renumbered in such a way that the outcome of the $k$-th pair is now the $k$-th chain. At this point, STATIC starts over again, using the configuration just obtained as the new input. This procedure is iterated until at most one chain of nonzero length has survived.

The procedure of $S_S$ is somewhat related to MODESTY and GREED, just without sorting the chains between fusion attempts. This results in far fewer requirements on the routing of the photons actually carrying the cluster states. From an experimentalist's point of view, STATIC is a meaningful choice as it only requires a minimal amount of classical feed-forward that is only present at the level of fusion gates, not on the level of routing the chains. It performs, however, asymptotically already better than GREED (see Fig. 5).

It turns out that STATIC performs rather poorly when acting on a configuration consisting only of EPR pairs. To cure this deficit, we will proceed in two stages. Firstly, the input is partitioned into blocks of eight EPR pairs each. Then MODESTY is used to transform each block into a single chain. The results of this first stage are subsequently used as the input to STATIC proper, as described before. Slightly overloading the term, we will call this combined strategy STATIC as well. Note that, even when understood in this wider sense, STATIC still reduces the need for physically re-routing chains: the blocks can be chosen to consist of neighboring qubits and no fusion processes between chains of different blocks are necessary during the first stage. The following theorem bounds STATIC's performance. For technical reasons, it is stated only for suitable $N$.

Theorem 4 (Linear performance of STATIC). For any $m \in \mathbb{N}$, given $N = 2^{3+m}$ EPR pairs, STATIC will produce a single chain of expected length

\[ Q_S(N) \ge (137/2048)\, N + 2. \]

The proof of the above theorem utilizes the following 
lemma which quantifies the quality one expects when com- 
bining several configurations. 

Lemma 5 (Combined configurations). The following holds.

1. Let $C$ be a configuration consisting of two chains of respective lengths $l_1, l_2$. Then

\[ Q(C) = l_1 + l_2 - 2 + 2^{1 - \min(l_1, l_2)} \ge l_1 + l_2 - 2. \tag{1} \]

2. Let $C^{(1)}, \dots, C^{(k)}$ be configurations. Let $S$ be a strategy that acts on $\bigoplus_a C^{(a)}$ by first acting with $S'$ on each of the $C^{(i)}$ and then acting insistently on the resulting chains. Then

\[ Q_S\Bigl(\bigoplus_i C^{(i)}\Bigr) \ge \sum_{i=1}^{k} Q_{S'}(C^{(i)}) - 2(k - 1). \]

3. When substituting all occurrences of the strategy-dependent qualities $Q_S$, $Q_{S'}$ by the optimal quality $Q$, the above estimate remains valid.

Proof. Firstly, any strategy will try to fuse the only two chains in the configuration together until it either succeeds or the shorter one of the two is destroyed (after $\min(l_1, l_2)$ unsuccessful attempts). In other words: in case of these special configurations any strategy is insistent. By Lemma 7

\begin{align*}
Q(l_1, l_2) &= l_1 + l_2 - \langle T \rangle \\
&= l_1 + l_2 - \sum_{i=0}^{\min(l_1, l_2) - 1} 2^{-i} \\
&= l_1 + l_2 - 2 + 2^{1 - \min(l_1, l_2)}.
\end{align*}

For the second part, we run $S'$ on each $C^{(i)}$, resulting in $k$ single-chain configurations $C'^{(i)} = e_{l_i}$ with probability distributions $p_i$ on $\mathcal{C}$ obeying $Q_{S'}(C^{(i)}) = \langle L \rangle_{p_i}$. The joint distribution on $\mathcal{C}^k$ is given by $p = \prod_i p_i$. Now we fuse the chains together. If $C'^{(i)}$ and $C'^{(j)}$ are such that $p_i(C'^{(i)})\, p_j(C'^{(j)}) \neq 0$, we unite them into one configuration $C = C'^{(i)} + C'^{(j)}$. Clearly, $C$ contains at most two chains which we fuse together as described in the first part of the Lemma. As the bound in Eq. (1) is linear in the respective lengths of the chains in $C$, the distribution $p' = p_i p_j$ fulfills on the one hand

\[ \langle Q \rangle_{p'} \ge \langle L \rangle_{p_i} + \langle L \rangle_{p_j} - 2 \]

and on the other hand

\[ Q\bigl( (\langle L \rangle_{p_i}, \langle L \rangle_{p_j}) \bigr) \ge \langle L \rangle_{p_i} + \langle L \rangle_{p_j} - 2 \]

for any insistent strategy. Because these two quantities are bounded by the same value we will use this bound and replace averages over $Q$ with $Q$ of configurations of average lengths.

We now iterate this scheme to obtain a single chain. A moment of thought reveals that - as a result of our neglecting the $2^{1 - \min(l_1, l_2)}$ term - the order in which chains are fused together does not enter the estimate for $Q_S$. The claim follows.

As for the third point: It follows by setting $S'$ to the optimal strategy. □

Proof (of Theorem 4). Consider a configuration consisting of $n = 2^m$ chains of length $x$ each. Using Lemma 5 one sees that the second stage of STATIC will convert it into a single chain of expected length $Q_S(2^m e_x) \ge (x - 2) n + 2$.

According to Section VI, MODESTY fulfils $Q_M(8) = Q(8) = 649/256$. Applying Lemma 5 again we find with $x = Q_M(2^3)$ and $N = 2^{3+m}$

\[ Q_S(N) \ge \frac{649/256 - 2}{8}\, N + 2 = \frac{137}{2048}\, N + 2 \approx 6.69 \times 10^{-2}\, N + 2. \]

□

In case of $p_s \neq 1/2$,

\[ Q_S(n e_x) \ge n (x - l_{\text{initial}}) + l_{\text{initial}} \]

can be obtained in the same way, where $l_{\text{initial}} = 2(1 - p_s)/p_s$ (similar to $n_c$ in [19]). Initial chains of length $> l_{\text{initial}}$ can be produced by employing for example GREED, but disregarding the outcome in case of a fusion failure and aborting the process when $2(1 - p_s)/p_s$ is reached. Although large chains are produced with only a small overall success probability, this does not affect the linear asymptotics as this process only depends on $p_s$, rather than $N$.
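Eq. (1) of Lemma 5 and the slope in Theorem 4 can be cross-checked with exact rational arithmetic. This is a minimal sketch assuming $p_s = 1/2$; the helper names `q_two`, `closed_form` and `static_slope` are illustrative only.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def q_two(l1, l2):
    """Expected final length when two chains are fused insistently."""
    if l1 == 0 or l2 == 0:
        return Fraction(l1 + l2)
    # Success joins the chains; failure removes one edge from each.
    return (Fraction(l1 + l2) + q_two(l1 - 1, l2 - 1)) / 2


def closed_form(l1, l2):
    """Right-hand side of Eq. (1)."""
    return l1 + l2 - 2 + Fraction(2) ** (1 - min(l1, l2))


def static_slope(block_quality, block_size):
    """Slope of the linear bound (x - 2) n + 2 with n = N / block_size."""
    return (block_quality - 2) / block_size


slope = static_slope(Fraction(649, 256), 8)
print(slope, float(slope))
```

With blocks of eight EPR pairs and $x = 649/256$ the slope evaluates to 137/2048, in line with the decimal value $6.69 \times 10^{-2}$ in the proof.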



VI. COMPUTER-ASSISTED RESULTS 
A. Algorithm for finding the optimal strategy 

Before passing from the concrete examples considered so far to the more abstract results of the next sections, it would be instructive to explicitly construct an optimal strategy for small $N$. Is that a feasible task for a desktop computer? Naively, one might expect it not to be. Since the number of strategies grows super-exponentially as a function of the total number of edges $N$ of the initial configuration, a direct comparison of the strategies' performances is quickly out of reach. Fortunately, a somewhat smarter, recursive algorithm can be derived which will be described in the following paragraph.

The number of vertices in a configuration is given by $V(C) := \sum_i C_i (i + 1)$. An attempted fusion will decrease $V(C)$ regardless of whether it succeeds or not. Now fix a $V_0$ and assume that we know the value of $Q$ for all configurations comprised of up to $V_0$ vertices. Let $C$ be such that $V(C) = V_0 + 1$. It is immediate that

\[ Q(C) = \max_{i,j} \bigl( Q(S_{i,j} C) + Q(F_{i,j} C) \bigr) / 2, \]

where $S_{i,j} C$ denotes the configuration resulting from successfully fusing chains of lengths $l_i$ and $l_j$. $F_{i,j} C$ is defined likewise. As the r.h.s. involves only the quality of configurations possessing less than or equal to $V_0$ vertices, we know its value by assumption and we can hence perform the maximization in $O(|C|^2)$ steps. One thus obtains the quality of $C$ and the pair of chains that need to be fused by an optimal strategy.

The algorithm now works by building a lookup table containing the value of $Q$ for all configurations up to a specific $V_{\max}$. It starts assessing the set of configurations with $V(C) = 1$ and works its way up, making at each step use of the previously found values. One needs to supply an anchor for the recursion by setting $Q(e_i) = i$. Clearly, the memory consumption is proportional to $|\mathcal{C}^{(N)}|$, which is exponential in $N$ and will limit the practical applicability of the algorithm before time issues do.
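The recursion can be sketched in a few lines. The version below is an assumption-laden illustration rather than the Mathematica program mentioned in the text: it uses top-down memoization instead of an explicit bottom-up table, takes $p_s = 1/2$, and represents a configuration as a sorted tuple of chain lengths. The names `optimal_quality` and `q_opt` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations


@lru_cache(maxsize=None)
def optimal_quality(config):
    """Q(C) for a sorted tuple of chain lengths."""
    if len(config) <= 1:
        return Fraction(sum(config))          # anchor: Q(e_i) = i
    best = None
    for i, j in combinations(range(len(config)), 2):
        rest = [l for k, l in enumerate(config) if k not in (i, j)]
        a, b = config[i], config[j]
        success = tuple(sorted(rest + [a + b]))
        failure = tuple(sorted(rest + [l for l in (a - 1, b - 1) if l > 0]))
        # Both outcomes have fewer vertices, so the recursion terminates.
        value = (optimal_quality(success) + optimal_quality(failure)) / 2
        if best is None or value > best:
            best = value
    return best


def q_opt(n_pairs):
    """Optimal expected final length starting from n_pairs EPR pairs."""
    return optimal_quality((1,) * n_pairs)


for n in range(1, 9):
    print(n, q_opt(n))
```

The cache plays the role of the lookup table, so memory grows with the number of distinct configurations visited, which is the limiting factor noted above.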

We have implemented this algorithm using the computer algebra system Mathematica and employed it to derive in closed form an optimal strategy for all configurations in $\mathcal{C}^{(46)}$, the quality of which is shown in Figures 5 and 6. A desktop computer is capable of performing the derivation in a few hours.

From the discussion above, it is clear that the leading term in the computational complexity of the algorithm is given by $|\mathcal{C}^{(N)}|$: every configuration needs to be looked at at least once. A straightforward analysis reveals a poly-log correction; the described program terminates after
0(\CW\(\og\CW\f) steps. 



B. Data, intuitive interpretation, and competing tendencies

Starting with $C_0 = N e_1$, MODESTY turns out to be the optimal strategy for $N < 10$. For configurations containing more edges, slight deviations from MODESTY can be advantageous. The difference relative to $Q(N)$ is smaller than $1.1 \times 10^{-3}$ for $N < 46$. More generally, two heuristic rules seem to hold:

1. It is favorable to fuse small chains (this is the dominant rule).

2. It is favorable to create chains of equal length.

Is there an intuitive model which can explain these findings? Several steps are required to find one. Firstly, note that every fusion attempt entails a 1/2 probability of failure, in which case two edges are destroyed. So "on average" the total length $L(C)$ decreases by one in each step and it is natural to assume that the quality $Q(C)$ equals $L(C)$ minus the expected number of fusion attempts a specific strategy will employ acting on $C$. Hence a good strategy aims to reach a single-chain configuration as quickly as possible, so as to reduce the expected number of fusions (this reasoning will be made precise in Section VIII). Now, if there are $k$ chains present in $C$, then a priori $k - 1$ successful fusions are needed before a strategy can terminate. If, however, in the course of the process one chain is completely destroyed, then $k - 2$ successes would already be sufficient. Therefore - paradoxically - within the given framework it pays off to destroy chains. Since shorter chains are more likely to become completely consumed due to failures, they should be subject to fusion attempts whenever possible. This explains the first rule.

There is one single scenario in which two chains can be destroyed in a single step; that is when one selects two EPR pairs to be fused together. Now consider the case where there are two chains of equal length in a configuration. If we keep on trying to fuse these two chains, then - in the event of repeated failures - we will eventually be left with two EPR pairs, which are favorable to obtain as argued before. Hence the second rule.

We have thereby identified two competing tendencies of the optimal strategy. Obtaining a quantitative understanding of their interplay seems extremely difficult: deviating from MODESTY at some point of time might open up the possibility of creating two chains of equal lengths many steps down the line. We hence feel it is sensible to conjecture that the globally optimal strategy allows not even for a tractable closed description. A proof of its optimality seems therefore beyond any reasonable effort. One is left with the hope of obtaining appropriately tight analytical bounds - and indeed, the sections to come pursue this programme with perhaps surprising success.



VII. LOWER BOUND


We will now turn to establishing rigorous upper and lower bounds to $Q$, i.e. the quality of the optimal strategy. These bounds, in turn, give rise to bounds to the resource consumption any linear optical scheme will have to face. Lower bounds are less technically involved than upper bounds. In fact, rigorous lower bounds can be based on known bounds for given strategies: For not too large configurations, the performance of various strategies can be calculated explicitly on a computer (see Section VI). Any such computation in turn gives a lower bound to $Q$. The following theorem is based on a construction which utilizes the computer results to build a strategy valid for inputs of arbitrary size. This strategy is simple enough to allow for an analytic analysis of its performance while at the same time being sufficiently sophisticated to yield a very tight lower bound for the quality, shown in Fig. 6. Notably, the resulting statement is not a numerical estimate valid for small $N$, but a proven bound valid for all $N$:

Theorem 6 (Lower bound for globally optimal strategy). Starting with $N$ EPR pairs and using fusion gates, the globally optimal strategy yields a cluster state of expected length

\[ Q(N) \ge Q_M(N_0) + \alpha (N - N_0), \tag{2} \]

for all $N > N_0$. The constants are

\[ N_0 = 92, \qquad Q_M(N_0) = 16.1069, \qquad \alpha = (Q_M(N_0) - 2)/N_0 = 0.153336. \]

Rational expressions are known and can be accessed at Ref.

Proof. Denote by $\tilde Q(N)$ the expected final length of some strategy acting on $N$ EPR pairs. Fix $N_0$ such that $\tilde Q(N)$ is known for all $N \le 2N_0$ and $\tilde Q$ satisfies for $N_0 \le N \le 2N_0$

\[ (\tilde Q(N) - 2)/N \ge (\tilde Q(N_0) - 2)/N_0 \tag{3} \]

and that Eqn. (2) holds for all $N \le 2N_0$.

Now assume we are given $N > 2N_0$ EPR pairs. Clearly, there are integers $k \ge 2$ and $0 \le M < N_0$ such that $N = kN_0 + M$. Set $n_i = N_0$ for $i = 1, \dots, k - 1$ and $n_k = N_0 + M$. The $n_i$ fulfill $\sum_i n_i = N$ and $N_0 \le n_i < 2N_0$. We partition the input into blocks of length $n_i$ each and compute

\begin{align*}
Q(N) &\ge \sum_{i=1}^{k} Q(n_i) - 2(k - 1) \\
&\ge \sum_{i=1}^{k} \tilde Q(n_i) - 2(k - 1) \\
&= \tilde Q(N_0) + \sum_{i=2}^{k} n_i \, \frac{\tilde Q(n_i) - 2}{n_i} \\
&\ge \tilde Q(N_0) + \sum_{i=2}^{k} n_i \, \frac{\tilde Q(N_0) - 2}{N_0} \\
&= \tilde Q(N_0) + \alpha \sum_{i=2}^{k} n_i \\
&= \tilde Q(N_0) + \alpha (N - N_0),
\end{align*}

where we made use of Lemma 5 and the assumptions mentioned above.

In the case of MODESTY the function $Q_M(N)$ can be explicitly computed for not too large values of $N$. Indeed, the results for all $N \le 2N_0 = 184$ can be found at Ref. They obey the condition in Eqn. (3) and the statement follows with $\tilde Q(N_0) = Q_M(N_0) = 16.1069$. □
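For orientation, the bound of Theorem 6 is easy to evaluate numerically. The sketch below uses only the rounded constants quoted in the theorem, so results are accurate to the quoted precision at best; the function name `lower_bound` is illustrative.

```python
N0 = 92
Q_N0 = 16.1069                 # Q_M(N0) as quoted in Theorem 6 (rounded)
ALPHA = (Q_N0 - 2) / N0        # slope of the linear lower bound


def lower_bound(n):
    """Right-hand side of Eq. (2); the theorem covers n > N0 only."""
    if n <= N0:
        raise ValueError("Theorem 6 is stated for N > N0")
    return Q_N0 + ALPHA * (n - N0)


print(round(ALPHA, 6))
print(lower_bound(184))
```

The slope reproduces the stated value 0.153336 after rounding.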



VIII. UPPER BOUNDS 

While the performance of any strategy delivers a lower bound for the optimal one, giving an upper bound is considerably harder. We will tackle the problem by passing to a family of simplified models. For every integer $R \ge 2$, the razor model with parameter $R$ is defined by introducing the following new rule: after every fusion step all chains will be cut down to a maximum length of $R$. Obviously, the full problem may be recovered with $R \ge N$. Given the complexity of the problem, it comes as a surprise that even for parameters as small as $R = 2$ the essential features of the full setup seem to be retained by the simplification, in the sense that understanding the razor model yields extraordinarily good bounds for $Q$.




FIG. 7: Performance of the optimal strategy in the razor model ($R = 2$ and $R = 3$), the full model ($R = N$) and the upper bound attained with the $R = 2$ razor model. The inset shows the convergence of the upper bound to the quality (based on the razor model with parameter $R$) vs. the razor parameter $R = 2, \dots, 10$ for $N = 30$ together with the optimal value $Q(30)$.



A. The razor model - outline 

In the spirit of Section IV, a configuration in the razor model is specified by a vector in $\mathbb{N}_0^R$. Thus, the number of configurations with a maximum total number of $N$ edges is certainly smaller than $N^R$, which is a polynomial in $N$. Adapting the techniques presented in Section VI, we can obtain the optimal strategy with polynomially scaling effort. We have thus identified a family of simplified problems which, in the limit of large $R$, tend to become exact, and where each instance is solvable in polynomial time.

How do the results of the razor model relate to the original problem? Clearly, for small values of $R$, $Q_{\mathrm{razor}}(C)$ will be a very crude lower bound to $Q(C)$. However, as indicated in Section VI, the quality of a configuration $C$ can be assessed in terms of the optimal strategy's expected number of fusion attempts $\langle T(C) \rangle$ when acting on $C$. It is intuitive to assume that $\langle T \rangle_{\mathrm{razor}} \le \langle T \rangle$, as the "cutting process" increases the probability of early termination. We will thus employ the following argument: for a given configuration $C$, derive a lower bound for $\langle T(C) \rangle_{\mathrm{razor}}$, which is in particular a lower bound for $\langle T(C) \rangle$, which in turn gives rise to the upper bound

\[ Q(C) \le L(C) - \langle T(C) \rangle_{\mathrm{razor}} \]

for $Q$.

The results of this ansatz are extremely satisfactory. Fig. [7] 
shows the performance of the optimal quality for various R, 
and the convergence when increasing R. 

The intuitive explanation for the success of the model is the observation that the chance that a chain of length $R$ is built up, and eventually again disappears, is exponentially suppressed as a function of $R$. That is, the crucial observation is that the error made by this radical modification is surprisingly small. A rigorous justification for this reasoning is supplied by the following two propositions which will be proved in the next section.

Lemma 7 (Quality and attempted fusions). The expected final length $\langle L \rangle$ equals the initial number of edges $L(C_0)$ minus the expected number of attempted fusions $\langle T \rangle$.

Theorem 8 (Bound to the full model from the razor model). Let $C_0 \in \mathcal{C}$ be a configuration. The optimal strategy in the setting of the razor model will use fewer fusion attempts on average to reach a final configuration starting from $C_0$ than will the optimal strategy of the full setup.



B. The razor model - proofs 

For the present section, it will prove advantageous to introduce some alternative points of view on the concepts used so far. Recall that a strategy is a function from configurations to actions. However, once we have fixed some initial configuration $C_0$, we can alternatively specify a strategy as a map from events to actions. Indeed, the configuration present after $n$ steps is completely fixed by the knowledge of the initial configuration, the past decisions of the strategy and the succession of failures and successes. We will call the resulting mapping the decision function $D_{S,C_0}$ and will suppress the indices whenever no danger of confusion can arise. In the same spirit, we are free to conceive random variables on $\mathcal{C}$ as real functions $f : \{S, F\}^{n_f} \to \mathbb{R}$. Expectation values are then computed as

\[ \langle f \rangle := \sum_{E, |E| = n_f} 2^{-n_f} f(E). \]

Quantities of the form $\langle f \rangle(C)$ for some configuration $C$ refer to expectation values $\langle f \rangle$ given the initial configuration $C_0 = C$.

An interesting class of random variables can be written in the form

\[ f(E) = \sum_{i=1}^{|E|} \phi_f(E_{1,\dots,i}), \tag{4} \]

where $\phi_f$ is some function of events and $E_{1,\dots,i}$ denotes the restriction of $E$ to its first $i$ elements. A simple example is the amount of lost edges $M(E)$ that was suffered as a result of $E$. Here,

\[ \phi_M(E_{1,\dots,k}) = \begin{cases} 2, & E_k = F \wedge D(E_{1,\dots,k-1}) \neq \emptyset, \\ 0, & \text{else}. \end{cases} \tag{5} \]

Let us refer to observables as in Eq. (4) as additive random variables. The following lemma states that when evaluating expectation values of additive variables, only their step-wise mean

\[ \bar\phi(E_{1,\dots,i}) := \bigl( \phi(E_{1,\dots,i-1}, S) + \phi(E_{1,\dots,i-1}, F) \bigr)/2 \]

enters the calculation.



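Lemma 9 (Expectation values of additive random variables). Let $f$ be an additive random variable. Set

\[ \bar f(E) := \sum_{i=1}^{|E|} \bar\phi_f(E_{1,\dots,i}). \]

Then $\langle f \rangle = \langle \bar f \rangle$.

Proof. Set $n = n_f$. We then have, by definition,

\begin{align*}
\langle f \rangle &= 2^{-n} \sum_{E, |E| = n} \sum_{i=1}^{n} \phi(E_{1,\dots,i}) \\
&= \sum_{i=1}^{n} 2^{-i} \sum_{E, |E| = i} \phi(E) \\
&= \sum_{i=1}^{n} 2^{-i} \sum_{E, |E| = i} \bar\phi(E) = \langle \bar f \rangle.
\end{align*}

□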



Note that

\[ \bar\phi_M(E_{1,\dots,i}) = \begin{cases} 1, & D(E_{1,\dots,i-1}) \neq \emptyset, \\ 0, & \text{else}, \end{cases} \]

in other words, $\bar\phi_M$ counts the number of attempted fusions $T$. Using Lemma 9 we see that the expected number of lost edges equals the expected number of fusion attempts: $\langle M \rangle = \langle T \rangle$. This proves Lemma 7.
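Lemma 7 can be checked on small instances by tracking the expected final length and the expected number of attempts side by side. The sketch below does this for MODESTY with $p_s = 1/2$; the function name `length_and_attempts` is illustrative, and any other strategy that runs until at most one chain is left could be substituted.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def length_and_attempts(config):
    """Return (<L>, <T>) for MODESTY acting on a sorted tuple of lengths."""
    if len(config) <= 1:
        return Fraction(sum(config)), Fraction(0)
    a, b, rest = config[0], config[1], list(config[2:])
    success = tuple(sorted(rest + [a + b]))
    failure = tuple(sorted(rest + [l for l in (a - 1, b - 1) if l > 0]))
    ls, ts = length_and_attempts(success)
    lf, tf = length_and_attempts(failure)
    # One attempt is spent now; the outcomes are equally likely.
    return (ls + lf) / 2, 1 + (ts + tf) / 2


for n in range(1, 11):
    mean_length, mean_attempts = length_and_attempts((1,) * n)
    print(n, mean_length, mean_attempts, mean_length == n - mean_attempts)
```

Each attempt destroys zero or two edges with equal probability, so one edge is lost per attempt on average, which is the content of the lemma.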

In the following proof of Theorem 8 we will employ the identity picture introduced in Section IV. The argument is broken down into a series of lemmas.

Lemma 10 (More is better than less). Let $C$ be a configuration. Then, for all $i$, $Q(C + e_i) \ge Q(C)$.

Proof. The proof is by induction on two parameters: on the number of chains $|C|$ and on the total length $L(C)$. To base the induction in both variables, we note that the claim is trivial if either $|C| \le 1$ or $L \le 2$.

Now consider any configuration $C$. Let $S$ be the optimal strategy and denote by $C_S$ and $C_F$ the configurations created by $S(C)$ in case of success and failure respectively. It is simple to check that $S(C)$ acting on $C + e_i$ yields $C_S + e_i$ or $C_F + e_i$. Hence

\[ Q(C + e_i) \ge \tfrac{1}{2} \bigl( Q(C_S + e_i) + Q(C_F + e_i) \bigr). \]

But unless $|C| \le 1$ we have that in any event $E \in \{S, F\}$ either $|C_E| < |C|$ or $L(C_E) < L(C)$ and thus the claim follows by induction. □

Lemma 11 (Winning is better than losing). Let $C \in \mathcal{C}$, let $C_S$ be the configuration resulting from the action of the optimal strategy on $C$ in the case of success, let $C_F$ be the obvious analogue. Then $Q(C_S) \ge Q(C_F)$.

Proof. Let $(k,l)$ be the action defined above. Clearly, $C_F = C - e_k - e_l$. By the last lemma, $Q(C_F) \le Q(C)$. But $Q(C)$ is the average of $Q(C_F)$ and $Q(C_S)$; hence

\[ Q(C_S) \ge Q(C) \ge Q(C_F). \]

□



Lemma 12 (No catalysis). Let $C \in \mathcal{C}$. Then, for all $i$, $Q(C + e_i) \le Q(C) + 1$.

Proof. We show the equivalent statement: for $C$ and $i$ s.t. $C_i \neq 0$ it holds that $Q(C - e_i) \ge Q(C) - 1$. Once more, the proof is by induction on $|C|$, $L$ and the validity of the claim for $|C| \le 1$ or $L \le 2$ is readily verified.

Let $C, S, C_S, C_F$ be as in the proof of Lemma 10. If the application of $S(C)$ and the subtraction of $e_i$ commute, we can proceed as we did in Lemma 10. A moment of thought reveals that this is always the case unless $C_i = 1$ and $S(C) = (i, k)$ (or, equivalently, $(k, i)$) for some $k$. In fact, in this case we have

\begin{align*}
C_S &= (\dots, l_i + l_k, \dots), \\
C_F &= (\dots, l_i - 1, \dots, l_k - 1, \dots),
\end{align*}

so that $C_F - e_i$ would take on a negative value at the $i$-th position. Note, however, that $C - e_i = C_S - e_i$. By induction it holds that $Q(C_S - e_i) \ge Q(C_S) - 1$ and further, by Lemma 11, $Q(C_S) - 1 \ge Q(C) - 1$ which concludes the proof. □

Lemma 13 (Fewer edges - fewer fusions). Let $C \in \mathcal{C}$, $i$ be such that $C_i \neq 0$. Then

\[ \langle T \rangle(C - e_i) \le \langle T \rangle(C), \]

where the expectation values are taken with respect to the respective optimal strategies.

Proof. We will show that, for every $C \in \mathcal{C}$, the optimal strategy acting on $C' := C - e_i$ will content itself with a lower number of average fusion attempts $\langle T \rangle(C')$ than will the optimal strategy acting on $C$. Recall that Lemma 7 states

\[ Q(C) = L(C) - \langle T \rangle(C). \]

Combining this and Lemma 12 we find

\begin{align*}
& Q(C') \ge Q(C) - 1 \\
\Leftrightarrow\ & L(C) - 1 - \langle T \rangle(C') \ge L(C) - \langle T \rangle(C) - 1 \\
\Leftrightarrow\ & \langle T \rangle(C') \le \langle T \rangle(C).
\end{align*}

□

We are finally in a position to tackle the original problem. 

Proof (of Theorem 8). Let $C_0$ be some configuration. We will build a strategy which is valid on $C_0$ in the razor model and uses fewer expected fusions than the optimal strategy in the full setup. Define the shaving operator $\hat R : \mathcal{C} \to \mathcal{C}$ which sets the length of each chain of length $i$ in the configuration it acts on to $\min(i, R)$. By a repeated application of the relation stated in Lemma 13 we see that $\langle T \rangle(\hat R C) \le \langle T \rangle(C)$.

We build the razor model strategy's decision function $D'$ inductively for all events in $\mathcal{E}_i$, for increasing $i$. Consider an event $E \in \mathcal{E}_i$. Denote by $C'_E$ the configuration resulting from $C_0$ under the action of $D'$ in the event of $E$. $C'_E$ is well-defined as only the values of $D'$ for events with length smaller than $i$ enter its definition. Set $D'(E)$ to the action taken by the optimal strategy for $\hat R C'_E$.

It is simple to verify that $D'$ defines a valid strategy for the razor model. By the results of the first paragraph, the expected number of fusions decreased in every step of the construction of $D'$. The claim follows. □

FIG. 8: The configuration space of the $R = 2$ razor model is $\mathbb{N}_0 \times \mathbb{N}_0$. Only the three actions a, b and c are available to reach the final configurations (exactly one EPR pair or GHZ state, or no chain at all), starting from the initial configuration that consists of $N$ EPR pairs.



C. An analytical bound - random walk 

Finally, we are in a position to prove an analytic upper 
bound on the yield of any strategy building one-dimensional 
cluster chains. Quite surprisingly, the description given by the 
razor model with a rather radical parameter of R = 2 is still 
faithful enough to deliver a good bound as will be explained 
now. 

In the $R = 2$ model, configurations are fully specified by giving the number of EPR pairs $n_1$ and of chains of length two $n_2$ they contain. Hence the configuration space is $\mathbb{N}_0 \times \mathbb{N}_0$ and we can picture it as the positive quadrant of a two-dimensional lattice. In each step a strategy can choose only among three non-trivial actions:

(a) Try to fuse two EPR pairs. We call this action a for brevity. Let $C_S$ be the configuration resulting from a successful application of a on $C$. Define the vector $a_S \in \mathbb{Z} \times \mathbb{Z}$ as $a_S := C_S - C$. An analogous definition for $a_F$ and some seconds of thought yield

\[ a_S := (-2, 1), \qquad a_F := (-2, 0). \]

(b) Try to fuse two chains of length one and two, respectively. In the same manner as above we have

\[ b_S := (-1, 0), \qquad b_F := (0, -1). \]

(c) Try to fuse two chains of length two.

\[ c_S := (0, -1), \qquad c_F := (2, -2). \]
The objective is to bound from below the minimum number of non-trivial actions taken on average. Initially, we start with $N$ EPR pairs, so $C_0 = (N, 0)$. As configuration space is a subspace of $\mathbb{N}_0 \times \mathbb{N}_0$, we can describe the situation by a random walk in a plane.

Any strategy will apply the rules a, b, c until one of the points (0,0), (1,0), (0,1) is reached (illustrated in Fig. 8). Our proof will be led by the following idea: by applying one of the three non-trivial actions to a configuration $C$, we will move "on average" by

\begin{align*}
\bar a &:= (a_S + a_F)/2 = (-2, 1/2), \\
\bar b &:= (-1/2, -1/2) \quad \text{or} \\
\bar c &:= (1, -3/2),
\end{align*}

respectively. The minimum number of expected fusion steps should then be given by the minimum number of vectors from $\{\bar a, \bar b, \bar c\}$ one has to combine to reach the origin starting from $(N, 0)$. This procedure amounts to an interchange of two averages. The aim is to reach the origin or a point with distance one to it on average as quickly as possible.
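Because every action lowers the vertex count $2 n_1 + 3 n_2$, the optimal expected number of attempts in the $R = 2$ razor model can be computed by a short recursion over the lattice. The following sketch assumes $p_s = 1/2$ and the six displacement vectors listed above; the names `ACTIONS`, `allowed` and `min_attempts` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache

# (success, failure) displacements in the (n1, n2) plane for actions a, b, c.
ACTIONS = {
    "a": ((-2, 1), (-2, 0)),   # fuse two EPR pairs
    "b": ((-1, 0), (0, -1)),   # fuse a chain of length one with one of length two
    "c": ((0, -1), (2, -2)),   # fuse two chains of length two
}


def allowed(name, n1, n2):
    """Whether the chains required by the action are present."""
    return {"a": n1 >= 2, "b": n1 >= 1 and n2 >= 1, "c": n2 >= 2}[name]


@lru_cache(maxsize=None)
def min_attempts(n1, n2):
    """Minimal expected number of fusion attempts starting from (n1, n2)."""
    if n1 + n2 <= 1:           # final configurations (0,0), (1,0), (0,1)
        return Fraction(0)
    options = []
    for name, (s, f) in ACTIONS.items():
        if allowed(name, n1, n2):
            after_success = min_attempts(n1 + s[0], n2 + s[1])
            after_failure = min_attempts(n1 + f[0], n2 + f[1])
            options.append(1 + (after_success + after_failure) / 2)
    return min(options)


for n in (2, 3, 6, 10, 20):
    print(n, min_attempts(n, 0), Fraction(4 * n, 5) - 2)
```

The second printed column is the exact razor-model value and the third is the linear expression $4N/5 - 2$ that appears in the convex optimization argument, which it should never fall below.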
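To make this intuition precise, set

\[ \phi_\delta(E_{1,\dots,i}) := D(E_{1,\dots,i-1})_{E_i}. \]

Recall that $D(E_{1,\dots,i-1})$ is one of $\{a, b, c, \emptyset\}$. Given the event $E$, $\phi_\delta(E)$ is the displacement caused by the last action applied to the configuration. For any event $E \in \{S, F\}^n$ we require that

\[ \delta(E) := \sum_i \phi_\delta(E_{1,\dots,i}) \le (-N + 1, 1), \tag{6} \]

which implies in particular that the same bound holds for $\langle \delta \rangle$. Define $a(E)$ to be the number of times the strategy will have decided to apply rule "a" in the chain of events $\{E_{1,\dots,i} \mid i = 1, \dots, |E|\}$ leading up to $E$. Formally

\[ \phi_a(E_{1,\dots,i}) := \begin{cases} 1, & D(E_{1,\dots,i-1}) = a, \\ 0, & \text{else}, \end{cases} \]

and $a(E) = \sum_i \phi_a(E_{1,\dots,i})$. Further,

\[ \bar\phi_\delta(E_{1,\dots,i}) = \phi_a(E_{1,\dots,i})\, \bar a + \phi_b(E_{1,\dots,i})\, \bar b + \phi_c(E_{1,\dots,i})\, \bar c, \]

where $\phi_b, \phi_c$ are defined in the obvious way. It follows that

\begin{align}
\langle \delta \rangle = \langle \bar\delta \rangle = \langle a \rangle \bar a + \langle b \rangle \bar b + \langle c \rangle \bar c &\le (-N + 1, 1), \tag{7} \\
\langle T \rangle &= \langle a \rangle + \langle b \rangle + \langle c \rangle. \tag{8}
\end{align}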



D. An analytical bound - convex optimization program 

Therefore, if $\langle T \rangle$ originates from a valid strategy it is necessarily subject to the constraints put forward in Eqs. (7), (8). For each $N \in \mathbb{N}$, a lower bound for the minimum expected number of losses is thus given by a linear program, i.e. a certain convex optimization problem: We define

\[ B := \begin{pmatrix} -2 & 1/2 \\ -1/2 & -1/2 \\ 1 & -3/2 \end{pmatrix}. \]

Then, this lower bound can be derived from the optimal solution of the linear program given by

\begin{align*}
\text{minimize} \quad & (1, 1, 1)\, x^T \\
\text{subject to} \quad & x B \le (-N + 1, 1), \\
& x \ge 0,
\end{align*}

where the latter inequality is meant as a component-wise positivity. This is a minimization over a vector $x \in \mathbb{R}^3$. In this way, the performance of the razor model is reduced to solving a family of convex optimization problems. According to Lemma 14, the solution of this linear program delivers the optimal objective value, so that

\[ \langle T \rangle \ge 4N/5 - 2 \]

for $N \ge 6$.



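Lemma 14 (Duality for linear program). The optimal objective values of the family of linear programs

\begin{align*}
\text{minimize} \quad & (1, 1, 1)\, x^T \\
\text{subject to} \quad & x B \le (-N + 1, 1), \\
& x \ge 0,
\end{align*}

are given by

\[ \mathrm{opt} = \begin{cases} 0, & N = 1, \\ (N - 1)/2, & N = 2, \dots, 5, \\ (4(N - 1) - 6)/5, & N \ge 6. \end{cases} \]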

Proof. This can be shown making use of Lagrange duality for linear programs. The dual to the above problem, referred to as primal problem, is found to be

\begin{align*}
\text{maximize} \quad & (N - 1, -1)\, y^T \\
\text{subject to} \quad & -y B^T \le (1, 1, 1), \\
& y \ge 0.
\end{align*}

This is a maximization problem in $y \in \mathbb{R}^2$, again a linear program (moreover, a duality gap can never appear, i.e., the objective values of the optimal solutions of the primal and the dual problems are identical). By finding - for each $N$ - a solution of the dual problem whose objective value is attained by the primal problem, we have hence proven optimality of the respective solution. For all $N$, this family of solutions can be determined to be

\[ y = \begin{cases} (0, 0), & N = 1, \\ (1/2, 0), & N = 2, \dots, 5, \\ (4/5, 6/5), & N \ge 6. \end{cases} \]

It is straightforward to show that these are solutions of the dual problem, and that the respective objective values are attained by appropriate solutions of the primal problem, e.g.

\[ x = \begin{cases} (0, 0, 0), & N = 1, \\ ((N - 1)/2, 0, 0), & N = 2, \dots, 5, \\ (2N/5, 2(N/5 - 1), 0), & N \ge 6. \end{cases} \]

The solutions yield the objective values stated in the lemma.

□
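The feasibility and equal-objective claims in the proof can be verified mechanically with exact fractions. The sketch below simply plugs in the primal and dual points given above; it does not solve the linear program, and the names `primal`, `dual` and `check` are illustrative.

```python
from fractions import Fraction as F

# Rows of B are the mean displacements of the actions a, b, c.
B = [(F(-2), F(1, 2)), (F(-1, 2), F(-1, 2)), (F(1), F(-3, 2))]


def primal(n):
    if n == 1:
        return (F(0), F(0), F(0))
    if n <= 5:
        return (F(n - 1, 2), F(0), F(0))
    return (F(2 * n, 5), 2 * (F(n, 5) - 1), F(0))


def dual(n):
    if n == 1:
        return (F(0), F(0))
    if n <= 5:
        return (F(1, 2), F(0))
    return (F(4, 5), F(6, 5))


def check(n):
    """Return (all conditions hold, common objective value) for a given N."""
    x, y = primal(n), dual(n)
    xb = tuple(sum(x[i] * B[i][j] for i in range(3)) for j in range(2))
    primal_ok = all(v >= 0 for v in x) and xb[0] <= -n + 1 and xb[1] <= 1
    dual_ok = all(v >= 0 for v in y) and all(
        -(y[0] * B[i][0] + y[1] * B[i][1]) <= 1 for i in range(3))
    primal_value = sum(x)
    dual_value = (n - 1) * y[0] - y[1]
    return primal_ok and dual_ok and primal_value == dual_value, primal_value


for n in (1, 2, 5, 6, 10):
    print(n, check(n))
```

A feasible primal point and a feasible dual point with equal objective values are both optimal, which is the argument the proof relies on.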

We subsequently highlight the consequence of this proof: we find the bound to the quality of the globally optimal strategy. This shows that asymptotically (for $p_s = 1/2$) at least five EPR pairs have to be invested on average (see also the subsequent section) per single gain of an edge in the linear cluster state.

Corollary 15 (Upper bound to globally optimal strategy). The quality of the optimal strategy for $N \ge 6$ is bounded from above by

\[ Q(N) \le N/5 + 2. \]

This is one of the main results of this work.
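The corollary combines three earlier ingredients. As a sketch, for the initial configuration of $N$ EPR pairs (so that $L = N$) and with $\langle T \rangle_{\mathrm{razor}}$ denoting the optimal expected number of attempts in the $R = 2$ razor model, the chain of inequalities reads:

```latex
\begin{align*}
Q(N) &= N - \langle T \rangle
      && \text{(Lemma 7)} \\
     &\le N - \langle T \rangle_{\mathrm{razor}}
      && \text{(Theorem 8)} \\
     &\le N - \left( \tfrac{4N}{5} - 2 \right)
      && \text{(Lemma 14, } N \ge 6 \text{)} \\
     &= \tfrac{N}{5} + 2 .
\end{align*}
```

For example, at $N = 92$ this gives $Q(92) \le 20.4$, which is compatible with the value $16.1069$ entering the lower bound of Theorem 6.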



IX. AN INVERSE QUESTION 

Recall that so far we treated the problem "given some fixed 
number of input pairs, how long a single chain can be obtained 
on average?". It is also legitimate to ask "how many input 
pairs are needed to produce a chain of some fixed length with 
(almost) unity probability of success?". After all, we might 
need just a specific length for a given task. In the present 
section we establish that both questions are asymptotically 
equivalent, in the sense that bounds for either problem imply 
bounds for the other one. 

Theorem 16 (Resources for given resulting length, upper bounds). Let $S$ be some strategy, let

\[ Q_S(N) \ge \alpha N + \beta \]

be a lower bound to its yield for some $\alpha, \beta \in \mathbb{R}$ and all $N \ge N_0$. Choose an $\epsilon > 0$. Then there exists a strategy $S'$ such that, if $S'$ acts on $(1/\alpha + \epsilon)L$ EPR pairs, it will output a single chain not shorter than $L$ with probability approaching unity as $L \to \infty$.

Proof. Choose a number b e N. Set N = (1/a + e)L. 
There are arbitrary large L such that b divides N and we will 
presently assume that L has this property. We comment on the 
general case in the end. 



The strategy S' proceeds in two stages, labeled I and II, to 
be analyzed in turn. Firstly, we divide the N input pairs into 
B = N/b blocks of size b and let S run on each of these 
blocks. 

Denote by $N_i$ the random variable describing the final output length of the $i$-th block, $i = 1, \dots, B$. The $N_i$ are independent, identically distributed variables satisfying $\langle N_i \rangle \geq \alpha b + \beta$. Set $N_{\mathrm{I}} = \sum_{i=1}^{B} N_i$ (the roman I signifies that we are dealing with the total length after the first stage of $S'$). As the $N_i$ are independent, the variance of $N_{\mathrm{I}}$ equals $B\sigma^2$, where $\sigma^2 < \infty$ is the variance of any of the $N_i$. By Chebyshev's inequality we have

$$P\left[\,|N_{\mathrm{I}} - \langle N_{\mathrm{I}} \rangle| > B^{3/4}\right] \leq \mathrm{Var}(N_{\mathrm{I}})\, B^{-3/2} = \sigma^2 B^{-1/2}.$$

In other words, the relation $|N_{\mathrm{I}} - \langle N_{\mathrm{I}} \rangle| < B^{3/4}$ holds almost certainly if we let $L$ (and hence $B$) go to infinity for any fixed $b$. The same is true in particular for the weaker statement

$$N_{\mathrm{I}} \geq \langle N_{\mathrm{I}} \rangle - B^{3/4} \geq B(\alpha b + \beta) - B^{3/4}.$$

In the second stage II, $S'$ builds up a single chain out of the $B$ ones obtained before. Irrespective of how $S'$ goes about this in detail, the process will stop after exactly $B - 1$ successful fusions. Now choose any $\delta > 0$. We claim that asymptotically no more than $(1+\delta)(B-1)$ failures will have occurred before the strategy terminates. Indeed, consider a sequence $E$ of $2(1+\delta/2)(B-1)$ fusion attempts. By the law of large numbers, $E$ contains no fewer than $B-1$ successes and not more than $(1+\delta)(B-1)$ failures, almost certainly as $B \to \infty$. Hence the final output length $N_{\mathrm{II}}$ fulfills

$$P\left[N_{\mathrm{II}} \geq B(\alpha b + \beta) - B^{3/4} - 2(1+\delta)B\right] \to 1$$

as $B \to \infty$. Plugging in the definitions of $B$ and $N$, the r.h.s. of the estimate takes on the form

$$L + L\left(\varepsilon\alpha + b^{-1} f_1(\alpha, \beta, \delta, \varepsilon)\right) - \left(\frac{L}{b}\, f_2(\alpha, \varepsilon)\right)^{3/4},$$

where $f_1, f_2$ are some (not necessarily positive) functions of the constants. By choosing the block length $b$ large enough, we can always make the second summand positive. For large enough $L$, the positive second term dominates the negative third one and hence $N_{\mathrm{II}} \geq L$ almost certainly as $L \to \infty$.
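To see how the block length $b$ enters, the right-hand side of the probability estimate can be evaluated numerically. The following Python sketch is only an illustration: the function name and the parameter values ($\alpha = 0.15$, $\beta = 0$, $\varepsilon = 1$, $\delta = 0.1$) are assumptions chosen for this example and are not fixed by the proof.

```python
def stage_two_guarantee(L, alpha, beta, eps, delta, b):
    """Evaluate B(alpha*b + beta) - B^(3/4) - 2(1 + delta)B for given constants."""
    N = (1.0 / alpha + eps) * L   # number of EPR pairs handed to S'
    B = N / b                     # number of blocks of size b
    return B * (alpha * b + beta) - B ** 0.75 - 2.0 * (1.0 + delta) * B


if __name__ == "__main__":
    # Illustrative constants only (assumed, not taken from the paper).
    constants = dict(alpha=0.15, beta=0.0, eps=1.0, delta=0.1)
    target = 10**6
    for block in (10, 1000):
        value = stage_two_guarantee(target, b=block, **constants)
        print(block, value, value >= target)
```

With these assumed values the guaranteed length exceeds the target $L$ for $b = 1000$ but not for $b = 10$, in line with the requirement that $b$ be chosen large enough.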

Lastly consider the case where $L$ is such that $b$ does not divide $N$. Choose $L > b/\varepsilon$. We can decompose $N = kb + r$ where $r < b$ and hence $r/L < b/L < \varepsilon$. Set $\varepsilon' = \varepsilon - r/L$. By construction $N' = (1/\alpha + \varepsilon')L = N - r = kb$ is divisible by $b$, and therefore already $N' < N$ input pairs are enough to build a chain of length $L$ asymptotically with certainty. □
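The counting argument for stage II can be checked with a small simulation. The sketch below assumes that every fusion attempt succeeds independently with probability 1/2 (the value implied by the sequence length $2(1+\delta/2)(B-1)$ used in the proof); the function name and the chosen $B$ and $\delta$ are hypothetical.

```python
import random


def failures_until_done(num_chains, p_success=0.5, rng=None):
    """Count failed fusions until num_chains - 1 fusions have succeeded."""
    rng = rng or random.Random()
    successes = 0
    failures = 0
    while successes < num_chains - 1:
        if rng.random() < p_success:
            successes += 1
        else:
            failures += 1
    return failures


if __name__ == "__main__":
    B, delta = 10_000, 0.1          # illustrative values only
    observed = failures_until_done(B, rng=random.Random(12345))
    print(observed, (1 + delta) * (B - 1))
```

For large $B$ the observed number of failures stays below $(1+\delta)(B-1)$ in essentially every run, which is the statement used to bound the length lost in stage II.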

Theorem 17 (Resources for given resulting length, lower bounds). Let

$$Q(N) \leq \alpha N + \beta$$

be some upper bound to the optimal strategy's performance. Choose an $\varepsilon > 0$. Then there exists no strategy $S'$ such that, if $S'$ acts on $(1/\alpha - \varepsilon)L$ EPR pairs, it will output a single chain not shorter than $L$ with probability approaching unity as $L \to \infty$.

Proof. Assume there is such a strategy $S'$. Then

$$\lim_{N \to \infty} \frac{Q_{S'}(N)}{N} \geq (1/\alpha - \varepsilon)^{-1} > \alpha.$$

Hence $Q_{S'}(N)$ is eventually larger than $Q(N)$, which is a contradiction. □

Suppose one aims to build a linear cluster state of length $N$. Combining the results of the present section with the findings of Sections VII and VIII yields that the goal is achievable with unit probability if more than $6.6N$ EPR pairs are available. Similarly, one will face a finite probability of failure in case there are fewer than $5N$ input pairs. Both statements are valid asymptotically for large $N$.



X. TWO-DIMENSIONAL CLUSTER STATES 
A. Preparation prescription 

We finally turn to the preparation of two-dimensional cluster states, which are universal resources for quantum computation. To build up a two-dimensional $n \times n$ cluster state clearly requires the consumption of $O(n^2)$ EPR pairs. That this bound can actually be met constitutes the main result of this section: this question had been open so far, with all known schemes exhibiting a worse scaling. From our previous derivations, we already know that length-$n$ linear cluster chains can be built consuming $O(n)$ entangled pairs. Hence it suffices to prove that linear chains with an accumulated length of $O(n^2)$ can be combined to an $n \times n$ cluster. Consequently, for the constructions to come, we will employ linear chains, as opposed to EPR pairs, as the basic building blocks.

Again, to actually connect two chains to form a two-dimensional structure, probabilistic gates from arbitrary architectures may be utilized. The following claim will hold for gates that, on failure, delete a constant number of edges from the participating chains (possibly a different number for each of the two chains) but do not split them (no $\sigma_z$ error outcome). In case of success, the gate shall create cross-like structures, again deleting a certain number of edges (see Fig. 10). In particular, the quadratic scaling as such is not altered by a possibly small probability of success $p_s < 1/2$.

The main problem faced is to find a preparation scheme that does not 'tear apart' successfully prepared intermediate states in case of a failed fusion. The challenge will be met by (a) switching from type-I to type-II fusion (Section X B) and (b) employing the pattern shown in Fig. 9 (Section X C).



FIG. 9: A possible pattern of how to arrange n + 1 linear clusters to 
build a two-dimensional cluster of width n. Fusion operations have 
to be applied at the black circles along the long linear cluster state. 
Free ends carrying spare overhead are shown as arrows. 



B. Linear optical type-II fusion gate

As for linear optics fusion gates, an error outcome in the type-I gate would tear each chain apart where we tried to fuse. Hence the related type-II fusion gate [5] with a more suitable error outcome will be used. How this one actually acts is shown in Fig. 10.

In preparation of a fusion attempt, a "redundantly encoded" qubit with two photons (see [5]) is produced in one chain by a $\sigma_x$ measurement, which consumes two edges (adding another $2n^2$ edges to the overall requirement). Now the fusion type-II gate creates a two-dimensional cross-like structure on success when being applied to one of the photons in the redundantly encoded qubit and one of the other chain's qubits. In case of failure it acts like a $\sigma_x$ measurement, therefore decreasing the encoding level of the redundancy encoded qubit by one and deleting two edges from the other chain, leaving us with a redundancy encoded qubit there. Hence, we may apply the fusion type-II again without any further preparation, deleting two edges on successive failures from the two chains alternately. For convenience we assume that we lose two edges per involved chain per failure instead. This increases the overhead requirement roughly by a factor of two but allows us to forget about the asymmetry in the fusion process. Hence, in the following any resource requirements will be given in terms of double edges instead of single ones.

Similar to the type-I case, the optimal success probability can be found. Actually this type of fusion gate should perform a Bell state measurement, hence $p_s \leq 1/2$ [28]. In fact, the gate proposed in Ref. [5] consists of the parity check, the Hadamard rotation and measurement of the second qubit (see Fig. 2), with two additional Hadamard gates applied before (which only map Bell states onto Bell states).

C. Asymptotic resource consumption for near-deterministic cluster state preparation

Theorem 18 (Quadratic scaling of resource overhead). For any success probability $p_s \in (0, 1]$ of type-II fusion, an $n \times n$ cluster state can be prepared using $O(n^2)$ edges in a way such that the overall probability of success approaches unity,

$$\lim_{n \to \infty} P_s(n) = 1.$$



Proof. The aim is to prepare an $n \times n$ cluster state, starting from $n + 1$ one-dimensional chains. For any integer $l$, the starting point is a collection of $n$ one-dimensional chains of length $m = n + l$, and a single longer chain of length $L = n(l + 1)$, referred to subsequently as the thread. In order to achieve the goal, a suitable choice for a pattern of fusion attempts is required. One such suitable "weaving pattern" is depicted in Fig. 9. Here, solid lines depict linear chains, whereas dots represent the vertices along the thread where fusion gates are being applied.

FIG. 10: The elementary linear optics tools for building two-dimensional structures from linear cluster chains. From top to bottom: a $\sigma_z$ measurement to remove unneeded nodes, a $\sigma_x$ measurement to create a redundancy encoded qubit in preparation of type-II fusion. The last figure shows the action of a fusion type-II attempt.

The aim will then be to identify a function $n \mapsto g(n)$ such that the choice $m = g(n)$ leads to the appropriate scaling of the resources. In fact, it will turn out that a linear function is already suitable, so for an $\alpha > 1/p_s$ we will consider $g(n) = \alpha n$. This number

$$m - n = g(n) - n = (\alpha - 1)n$$

quantifies the resource overhead: in case of failure, one can make use of this overhead to continue with the prescription without destroying the cluster state. If this overhead is too large, we fail to meet the strict requirements on the scaling of the overall resource consumption; if it is too small, the probability of failure becomes too large. Note that there is an additional overhead reflected by the choice of $L$. This, however, is suitably chosen not to have an implication on the asymptotic scaling of the resources.
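The quadratic scaling of this choice can be made explicit by simply adding up the chain lengths. The Python sketch below is a minimal illustration under stated assumptions: it uses total chain length as a proxy for the edge count, rounds $\alpha n$ up to an integer, and the function and key names are hypothetical.

```python
import math


def weaving_resources(n, alpha):
    """Chain lengths used by the weaving pattern for an n x n cluster."""
    m = math.ceil(alpha * n)   # length of each of the n short chains, m = g(n)
    l = m - n                  # overhead per chain, l = m - n
    thread = n * (l + 1)       # length of the thread, L = n(l + 1)
    return {
        "chain_length": m,
        "overhead": l,
        "thread_length": thread,
        "total_length": n * m + thread,
    }


if __name__ == "__main__":
    for n in (10, 100, 1000):
        res = weaving_resources(n, alpha=3)
        # The ratio total_length / n^2 approaches 2*alpha - 1.
        print(n, res["total_length"], res["total_length"] / n**2)
```

For integer $\alpha n$ the total is $(2\alpha - 1)n^2 + n$, so the constant in front of $n^2$ is set by $\alpha$.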

Given the above prescription, depending on $n$, the overall probability $P_s(n)$ of succeeding to prepare an $n \times n$ cluster state can be written as

$$P_s(n) = \pi_s(n)^n.$$

Here,

$$\pi_s(n) = \sum_{k=0}^{(\alpha - 1)n} p_s^{n} (1 - p_s)^{k} \binom{n + k - 1}{k}$$

is the success probability to weave a single chain of length $\alpha n$ into the carpet of size $n$, with the binomial quantifying the number of ways to distribute $k$ failures on $n$ nodes [34]. $p_s$ and $1 - p_s$ are the success and failure probabilities for a fusion attempt, respectively. It can be rephrased as the probability to find at least $n$ successful outcomes in $\alpha n$ trials,

$$\pi_s(n) = \sum_{k=n}^{\alpha n} (1 - p_s)^{\alpha n - k}\, p_s^{k} \binom{\alpha n}{k} = 1 - F(n - 1, \alpha n, p_s).$$

Here, $F$ denotes the standard cumulative distribution function of the binomial distribution [35]. Since $n - 1 \leq \alpha n p_s$ for all $n$, as $\alpha > 1/p_s$ is assumed, we can hence bound $\pi_s(n)$ from below by means of Hoeffding's inequality [36, 37], providing an exponentially decaying upper bound of the tails of the cumulative distribution function. This gives rise to the lower bound

$$\pi_s(n) \geq 1 - \exp\left(-\frac{2(\alpha n p_s - n + 1)^2}{\alpha n}\right).$$

Now, again since $\alpha > 1/p_s$, we have that

$$\pi_s(n)^n \geq \left(1 - \exp(-cn)\right)^n$$

with $c := 2(\alpha p_s - 1)^2/\alpha > 0$. Further, for any $k \in \mathbb{N}$ there exists an $n_0 \in \mathbb{N}$ such that for all $n > n_0$

$$\left(1 - \exp(-cn)\right)^n \geq \left(1 - 1/(kn)\right)^n.$$

Noticing

$$\lim_{n \to \infty} \left(1 - 1/(kn)\right)^n = e^{-1/k},$$

we can find for any $\varepsilon > 0$ a $k$ satisfying $1 - e^{-1/k} < \varepsilon$. Therefore, for any $\varepsilon > 0$ it holds that $\lim_{n \to \infty} P_s(n) \geq 1 - \varepsilon$. This ends the argument leading to the appropriate scaling. □
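The Hoeffding step can be compared against the exact binomial tail for small instances. The following Python sketch is an illustration only: it assumes an integer $\alpha$ so that $\alpha n$ is an integer number of trials, and all function names are hypothetical.

```python
import math


def binom_cdf(k, trials, p):
    """F(k, trials, p): probability of at most k successes in `trials` attempts."""
    return sum(
        math.comb(trials, j) * p**j * (1.0 - p) ** (trials - j)
        for j in range(k + 1)
    )


def pi_exact(n, alpha, p):
    """Probability of at least n successes in alpha * n fusion attempts."""
    return 1.0 - binom_cdf(n - 1, alpha * n, p)


def pi_hoeffding(n, alpha, p):
    """Hoeffding lower bound on pi_exact, valid for alpha > 1/p."""
    return 1.0 - math.exp(-2.0 * (alpha * n * p - n + 1) ** 2 / (alpha * n))


def overall_success(n, alpha, p):
    """P_s(n) = pi_s(n)^n for weaving n chains."""
    return pi_exact(n, alpha, p) ** n


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        print(n, pi_hoeffding(n, 4, 0.5), pi_exact(n, 4, 0.5), overall_success(n, 4, 0.5))
```

In this example ($p_s = 1/2$, $\alpha = 4$) the bound lies below the exact value for every $n$, and the overall success probability is already very close to one for moderate $n$.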

Even within the setting of quadratic resources, the appropriate choice for $\alpha$ does have an impact: If the probability of success $p_s$ is too small for a given $\alpha$,

$$1/p_s > \alpha > 1,$$

then this will lead to $\lim_{n \to \infty} P_s(n) = 0$, so the preparation of the cluster will eventually fail, asymptotically with certainty. This sudden change of the asymptotic behavior of the resource requirements, leading essentially to either almost unit (almost all cluster states can successfully be prepared) or almost vanishing success probability, is a simple threshold phenomenon as in percolation theory. In turn, for a given $\alpha$, $p_{\mathrm{th}} = 1/\alpha$ can be taken as a threshold probability: above this threshold almost all preparations will succeed, below it they will fail [38]. This number $\alpha$ essentially dictates the constant factor in front of the quadratic behavior in the scaling of the resource requirements. Needless to say, this depends on $p_s$.
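The threshold behavior can be illustrated with a small Monte Carlo experiment. The sketch below is a simplified model under stated assumptions: each chain is woven in independently, every failed fusion costs one unit of the overhead $(\alpha - 1)n$, and the function names, seed, and parameter values are hypothetical.

```python
import random


def weave_chain(n, budget, p, rng):
    """Try to obtain n successful fusions with at most `budget` failures."""
    successes = 0
    failures = 0
    while successes < n:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
            if failures > budget:
                return False   # overhead exhausted before all n nodes were fused
    return True


def estimate_success(n, alpha, p, runs, seed=0):
    """Monte Carlo estimate of P_s(n) for chains of length alpha * n."""
    rng = random.Random(seed)
    budget = round(alpha * n) - n   # allowed failures per chain
    wins = sum(
        all(weave_chain(n, budget, p, rng) for _ in range(n))
        for _ in range(runs)
    )
    return wins / runs


if __name__ == "__main__":
    p = 0.5   # threshold for alpha is 1/p = 2
    print("alpha = 4.0:", estimate_success(40, 4.0, p, runs=200))
    print("alpha = 1.5:", estimate_success(40, 1.5, p, runs=200))
```

Already for $n = 40$ the estimate is close to one for $\alpha$ well above $1/p_s$ and close to zero for $\alpha$ below it.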

This analysis shows that a two-dimensional cluster state can indeed be prepared using $O(n^2)$ edges, employing probabilistic quantum gates only. This can be viewed as good news, as
it shows that the natural scaling of the use of such resources 
can indeed be met, with asymptotically negligible error. Pre- 
viously, only strategies leading to a super-quadratic resource 
consumption have been known. In turn, any such other scal- 
ing of the resources could have been viewed as a threat to the 
possibility of being able to prepare higher-dimensional cluster 
states using probabilistic quantum gates. 

XI. SUMMARY, DISCUSSION, AND OUTLOOK

In this paper, we have addressed the question of how to 
prepare cluster states using probabilistic gates. The emphasis 
was put on finding bounds that the optimal strategy neces- 
sarily has to satisfy, to identify final bounds on the resource 
overhead necessary in such a preparation. This issue is particularly relevant in the context of linear optics, where the necessary overhead in resources is one of the major challenges inherent in this type of architecture. It turns out that the way the classical strategy is chosen has a major impact on the resource consumption. By providing these rigorous bounds, we hope to give a guideline to the feasibility of probabilistic state generation. One central observation, e.g., is that for any preparation of linear cluster states using linear optical gates as specified above, one necessarily needs at least five EPR pairs per average gain of one edge. Within these rules, this limit cannot be undercut any further. But needless to say, the derived results are also applicable to other architectures, and we tried to separate the general statements from those that focus specifically on linear optical setups.

It is also the hope that the introduced tools and ideas are ap- 
plicable beyond the exact context discussed in the present pa- 
per. There are good reasons to believe that these methods may 
prove useful even when changing the rules: For example, as 
fusion type-II can also be used for production of redundancy 
encoding resource states Ja] and linear cluster states in a sim- 
ilar fashion, similar bounds to resource consumption may be 
derived for these schemes. Due to the fact that fusion type-II 
does not require photon number resolving detectors, this could 
be a matter of particular interest for experimental realizations. 
Also, generalizations of some of the statements for $p_s \neq 1/2$
have been explicitly derived. Other generalizations may well 
also be proven with the tools developed in this paper. 



Concerning lossy operations, we emphasize again that 
when all EPR pairs are simultaneously created in the begin- 
ning, their storage time will be minimized by application of 
the strategy that optimizes the expected final length. Obvi- 
ously, the problem of storage using fiber loops or memories 
is a key issue in any realization. Yet, for a given loss mech- 
anism, it would be interesting to see to what extent a modi- 
fication of the optimal protocol would follow - compared to 
the one here assuming perfect operations - depending on the 
figures of merit chosen. One then expects trade-offs between 
different desiderata to become relevant [39]. In the way it is
done here, decoherence induced by the actual gates employed 
for the fusion process is also minimized exactly by choosing 
the optimal strategy of this work: it needs the least number of 
uses of the underlying quantum gates. 

Further, studies in the field of fault tolerance may well ben- 
efit from this approach. To start with, one has to be aware that 
the overhead induced in fully fault-tolerant one-way comput- 
ing schemes is quite enormous |2~ill . This is extenuated when 
considering photon loss only as a source of errors la, 12211 . Yet, 
methods such as the ones presented here can be expected to be useful in very significantly reducing the number of gate invocations in the preparation of the resources.